ctk load table: Add support for MongoDB Change Streams
Showing 8 changed files with 336 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,268 @@ | ||
(mongodb-cdc-relay)= | ||
# MongoDB CDC Relay | ||
|
||
## About | ||
Relay a [MongoDB Change Stream] into a [CrateDB] table using a one-stop command | ||
`ctk load table mongodb+cdc://...`, or `mongodb+srv+cdc://` for MongoDB Atlas. | ||
|
||
Use it to transfer data conveniently within data pipelines or for ad hoc | ||
operations. It is available both as a command-line interface and as a | ||
library. | ||
|
||
|
||
## Install | ||
```shell | ||
pip install --upgrade 'cratedb-toolkit[mongodb]' | ||
``` | ||
|
||
:::{tip} | ||
The tutorial also uses the programs `crash`, `mongosh`, and `atlas`. `crash` | ||
is installed with CrateDB Toolkit, but `mongosh` and `atlas` must be | ||
installed by other means. If you are using Docker anyway, you can use the | ||
following command aliases to provide them in your environment without | ||
installing them. | ||
|
||
```shell | ||
alias mongosh='docker run -i --rm --network=host mongo:7 mongosh' | ||
``` | ||
|
||
The `atlas` program needs to store authentication information between invocations, | ||
therefore you need to supply a storage volume. | ||
```shell | ||
mkdir atlas-config | ||
alias atlas='docker run --rm -it --volume=$(pwd)/atlas-config:/root mongodb/atlas atlas' | ||
``` | ||
::: | ||
|
||
|
||
## Usage | ||
|
||
(mongodb-cdc-workstation)= | ||
### Workstation | ||
These guidelines assume that both services, CrateDB and MongoDB, are listening | ||
on `localhost`. | ||
For guidelines on how to provide them on your workstation using | ||
Docker or Podman, see the {ref}`mongodb-cdc-services-standalone` section below. | ||
```shell | ||
export MONGODB_URL=mongodb://localhost/testdrive | ||
export MONGODB_URL_CTK=mongodb+cdc://localhost/testdrive/demo | ||
export CRATEDB_SQLALCHEMY_URL=crate://crate@localhost/testdrive/demo-cdc | ||
ctk load table "${MONGODB_URL_CTK}" | ||
``` | ||
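The two MongoDB URLs above differ only in their scheme and in the trailing collection name. As a minimal sketch, assuming you prefer to define the base URL once, the CDC variant could be derived with plain shell parameter expansion. The `MONGODB_COLLECTION` variable is a name introduced here for illustration; it is not something CrateDB Toolkit reads.

```shell
# Base connection URL, as used with `mongosh`.
export MONGODB_URL="mongodb://localhost/testdrive"

# Hypothetical helper variable: the collection to relay.
MONGODB_COLLECTION="demo"

# Swap the `mongodb://` scheme for `mongodb+cdc://` and append the collection.
export MONGODB_URL_CTK="mongodb+cdc://${MONGODB_URL#mongodb://}/${MONGODB_COLLECTION}"

echo "${MONGODB_URL_CTK}"
```

This way, the address passed to `mongosh` and the address passed to `ctk load table` cannot drift apart.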
|
||
Insert a document into the MongoDB collection, and update it. | ||
```shell | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.insertOne({"foo": "bar"})' | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.updateOne({"foo": "bar"}, { $set: { status: "D" } })' | ||
``` | ||
|
||
Query data in CrateDB. | ||
```shell | ||
crash --command 'SELECT * FROM "testdrive"."demo-cdc";' | ||
``` | ||
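The relay applies changes asynchronously, and CrateDB makes newly written rows visible to queries only after a table refresh, so a query issued right after the `mongosh` commands may not show the document yet. The following is a sketch of a small polling helper, not part of CrateDB Toolkit. The function names `retry` and `document_arrived` are made up for this example, and the `grep` pattern assumes that the value `bar` appears somewhere in the query output.

```shell
# Retry a command until it succeeds or the attempts are used up.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  retry_attempts="$1"
  retry_delay="$2"
  shift 2
  retry_count=1
  while [ "${retry_count}" -le "${retry_attempts}" ]; do
    if "$@"; then
      return 0
    fi
    retry_count=$((retry_count + 1))
    sleep "${retry_delay}"
  done
  return 1
}

# Assumption: the relayed document shows up with the value `bar` somewhere
# in the query output. Adjust the pattern to your data.
document_arrived() {
  # Make recent writes visible before querying.
  crash --command 'REFRESH TABLE "testdrive"."demo-cdc";' > /dev/null 2>&1
  crash --command 'SELECT * FROM "testdrive"."demo-cdc";' | grep -q 'bar'
}
```

For example, `retry 10 1 document_arrived` polls once per second, up to ten times, and returns a non-zero exit code if the document never appears.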
|
||
Invoke a delete operation, and check data in CrateDB once more. | ||
```shell | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.deleteOne({"foo": "bar"})' | ||
crash --command 'SELECT * FROM "testdrive"."demo-cdc";' | ||
``` | ||
|
||
(mongodb-cdc-cloud)= | ||
### Cloud | ||
These guidelines assume that you are using the cloud variants of both services, | ||
CrateDB Cloud and MongoDB Atlas. | ||
For guidelines on how to provision the relevant cloud resources, | ||
see the {ref}`mongodb-cdc-services-cloud` section below. | ||
|
||
:::{rubric} Invoke pipeline | ||
::: | ||
A canonical invocation for ingesting MongoDB Atlas Change Streams into | ||
CrateDB Cloud. | ||
|
||
```shell | ||
export MONGODB_URL=mongodb+srv://user:PASSWORD@ATLAS_HOST/testdrive | ||
export MONGODB_URL_CTK=mongodb+srv+cdc://user:PASSWORD@ATLAS_HOST/testdrive/demo | ||
export CRATEDB_HTTP_URL="https://admin:PASSWORD@CRATEDB_HOST:4200/" | ||
export CRATEDB_SQLALCHEMY_URL="crate://admin:PASSWORD@CRATEDB_HOST:4200/testdrive/demo-cdc?ssl=true" | ||
``` | ||
```shell | ||
ctk load table "${MONGODB_URL_CTK}" | ||
``` | ||
|
||
:::{note} | ||
Please note the `mongodb+srv://` and `mongodb+srv+cdc://` URL schemes, and the | ||
`ssl=true` query parameter. The URL schemes are needed to connect to MongoDB | ||
Atlas, and the query parameter is needed to connect to CrateDB Cloud. | ||
::: | ||
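A quick sanity check can catch a wrong URL scheme or a missing `ssl=true` before the pipeline is started. The following is a sketch only: the functions `starts_with` and `check_cloud_urls` are hypothetical helpers written for this example and are not provided by CrateDB Toolkit.

```shell
# Succeed when the first argument starts with the second one.
starts_with() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the three settings called out in the note above.
check_cloud_urls() {
  if ! starts_with "${MONGODB_URL}" "mongodb+srv://"; then
    echo "MONGODB_URL should use the mongodb+srv:// scheme" >&2
    return 1
  fi
  if ! starts_with "${MONGODB_URL_CTK}" "mongodb+srv+cdc://"; then
    echo "MONGODB_URL_CTK should use the mongodb+srv+cdc:// scheme" >&2
    return 1
  fi
  case "${CRATEDB_SQLALCHEMY_URL}" in
    *"?ssl=true"* | *"&ssl=true"*) ;;
    *)
      echo "CRATEDB_SQLALCHEMY_URL should include ssl=true" >&2
      return 1
      ;;
  esac
  return 0
}
```

With the environment variables exported, `check_cloud_urls && ctk load table "${MONGODB_URL_CTK}"` starts the relay only when all three checks pass.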
|
||
:::{rubric} Trigger CDC events | ||
::: | ||
Inserting a document into the MongoDB collection, and updating it, will trigger two CDC events. | ||
```shell | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.insertOne({"foo": "bar"})' | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.updateOne({"foo": "bar"}, { $set: { status: "D" } })' | ||
``` | ||
|
||
:::{rubric} Query data in CrateDB | ||
::: | ||
```shell | ||
crash --hosts "${CRATEDB_HTTP_URL}" --command 'SELECT * FROM "testdrive"."demo-cdc";' | ||
``` | ||
|
||
|
||
## Appendix | ||
|
||
### Database Operations | ||
A few operations that are handy when exploring this exercise. | ||
|
||
Reset MongoDB collection. | ||
```shell | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.drop()' | ||
``` | ||
|
||
Reset CrateDB table. | ||
```shell | ||
crash --command 'DELETE FROM "testdrive"."demo-cdc";' | ||
``` | ||
|
||
Display documents in MongoDB collection. | ||
```shell | ||
mongosh "${MONGODB_URL}" --eval 'db.demo.find()' | ||
``` | ||
|
||
(mongodb-cdc-services-standalone)= | ||
### Standalone Services | ||
Quickly start CrateDB and MongoDB using Docker or Podman. | ||
|
||
#### CrateDB | ||
Start CrateDB. | ||
```shell | ||
docker run --rm -it --name=cratedb --publish=4200:4200 --env=CRATE_HEAP_SIZE=2g \ | ||
crate:5.7 -Cdiscovery.type=single-node | ||
``` | ||
|
||
#### MongoDB | ||
Start MongoDB. | ||
Please note that change streams are only available for replica sets and | ||
sharded clusters, so let's define a replica set by using the | ||
`--replSet rs-testdrive` option when starting the MongoDB server. | ||
```shell | ||
docker run -it --rm --name=mongodb --publish=27017:27017 \ | ||
mongo:7 mongod --replSet rs-testdrive | ||
``` | ||
|
||
Now, initialize the replica set by using the `mongosh` command to invoke | ||
the `rs.initiate()` operation. | ||
```shell | ||
export MONGODB_URL="mongodb://localhost/" | ||
docker run -i --rm --network=host mongo:7 mongosh ${MONGODB_URL} <<EOF | ||
config = { | ||
_id: "rs-testdrive", | ||
members: [{ _id : 0, host : "localhost:27017"}] | ||
}; | ||
rs.initiate(config); | ||
EOF | ||
``` | ||
|
||
|
||
(mongodb-cdc-services-cloud)= | ||
### Cloud Services | ||
Quickly provision [CrateDB Cloud] and [MongoDB Atlas]. | ||
|
||
#### CrateDB Cloud | ||
To provision a database cluster, use either the [croud CLI], or the | ||
[CrateDB Cloud Web Console]. | ||
|
||
Invoke CLI login. | ||
```shell | ||
croud login | ||
``` | ||
Create organization. | ||
```shell | ||
croud organizations create --name samplecroudorganization | ||
``` | ||
Create project. | ||
```shell | ||
croud projects create --name sampleproject | ||
``` | ||
Deploy cluster. | ||
```shell | ||
croud clusters deploy \ | ||
--product-name crfree \ | ||
--tier default \ | ||
--cluster-name testdrive \ | ||
--subscription-id 782dfc00-7b25-4f48-8381-b1b096dd1619 \ | ||
--project-id 952cd102-91c1-4837-962a-12ecb71a6ba8 \ | ||
--version 5.8.0 \ | ||
--username admin \ | ||
--password "as6da9ddasfaad7i902jcv780dmcba" | ||
``` | ||
|
||
Finally, use the username and password given to the deploy command to populate | ||
`CRATEDB_HTTP_URL` and `CRATEDB_SQLALCHEMY_URL` at {ref}`mongodb-cdc-cloud` properly. | ||
|
||
When shutting down your workbench, you may want to clean up any cloud resources | ||
you just used. | ||
```shell | ||
croud clusters delete --cluster-id CLUSTER_ID | ||
``` | ||
|
||
#### MongoDB Atlas | ||
To provision a database cluster, use either the [Atlas CLI], or the | ||
Atlas User Interface. | ||
|
||
Create an API key. | ||
```shell | ||
atlas projects apiKeys create --desc "Ondemand Testdrive" --role GROUP_OWNER | ||
``` | ||
```text | ||
API Key '889727cb5bfe8830d0f8a203' created. | ||
Public API Key bksttjep | ||
Private API Key 9f8c1c41-b5f7-4d2a-b1a0-a1d2ef457796 | ||
``` | ||
Enter authentication key information. | ||
```shell | ||
atlas config init | ||
``` | ||
Create database cluster. | ||
```shell | ||
atlas clusters create testdrive --provider AWS --region EU_CENTRAL_1 --tier M0 --tag env=dev | ||
``` | ||
Retrieve the connection string. | ||
```shell | ||
atlas clusters connectionStrings describe testdrive | ||
``` | ||
```text | ||
mongodb+srv://testdrive.jaxmmfp.mongodb.net | ||
``` | ||
|
||
Finally, create a "Database Access" user and use the credentials to populate | ||
`MONGODB_URL` and `MONGODB_URL_CTK` at {ref}`mongodb-cdc-cloud` properly. | ||
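As a sketch of that last step, the two variables could be composed from the connection string printed by the `atlas` command and the credentials of the "Database Access" user. The variable names `ATLAS_URL`, `ATLAS_HOST`, `MONGODB_USER`, and `MONGODB_PASSWORD` are introduced here for illustration, and the credential values are placeholders.

```shell
# Connection string as printed by `atlas clusters connectionStrings describe`.
ATLAS_URL="mongodb+srv://testdrive.jaxmmfp.mongodb.net"

# Placeholder credentials of the "Database Access" user.
MONGODB_USER="user"
MONGODB_PASSWORD="password"

# Strip the scheme to obtain the bare host name.
ATLAS_HOST="${ATLAS_URL#mongodb+srv://}"

# URL for `mongosh`, and URL for `ctk load table` (database/collection).
export MONGODB_URL="mongodb+srv://${MONGODB_USER}:${MONGODB_PASSWORD}@${ATLAS_HOST}/testdrive"
export MONGODB_URL_CTK="mongodb+srv+cdc://${MONGODB_USER}:${MONGODB_PASSWORD}@${ATLAS_HOST}/testdrive/demo"

echo "${MONGODB_URL_CTK}"
```

If the password contains characters such as `@`, `:`, or `/`, percent-encode them first, because they are otherwise read as part of the URL structure.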
|
||
When shutting down your workbench, you may want to clean up any cloud resources | ||
you just used. | ||
```shell | ||
atlas clusters delete testdrive | ||
``` | ||
|
||
|
||
## Backlog | ||
:::{todo} | ||
- Improve UX/DX. | ||
- Provide `ctk shell`. | ||
- Provide [SDK and CLI for CrateDB Cloud Cluster APIs]. | ||
|
||
[SDK and CLI for CrateDB Cloud Cluster APIs]: https://github.com/crate-workbench/cratedb-toolkit/pull/81 | ||
::: | ||
|
||
|
||
[Atlas CLI]: https://www.mongodb.com/docs/atlas/cli/ | ||
[commons-codec]: https://pypi.org/project/commons-codec/ | ||
[CrateDB]: https://cratedb.com/docs/guide/home/ | ||
[CrateDB Cloud]: https://cratedb.com/docs/cloud/ | ||
[MongoDB Atlas]: https://www.mongodb.com/atlas | ||
[MongoDB Change Stream]: https://www.mongodb.com/docs/manual/changeStreams/ | ||
[croud CLI]: https://cratedb.com/docs/cloud/en/latest/tutorials/deploy/croud.html | ||
[CrateDB Cloud Web Console]: https://cratedb.com/docs/cloud/en/latest/tutorials/quick-start.html#deploy-cluster |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,9 @@ | ||
--- | ||
orphan: true | ||
--- | ||
|
||
(migr8)= | ||
# migr8 | ||
# migr8 migration utility | ||
|
||
## About | ||
|
||
|