
multiple collections mongodb to Kafka topic

The application writes data every month to a new collection (for example, journal_2205, journal_2206). Is it possible to configure the connector so that it reads the oplog from each new collection and writes to one topic? I use the connector https://www.mongodb.com/docs/kafka-connector/current/source-connector/. Thank you!

Yes, this is possible: you can listen to change streams from multiple MongoDB collections. You just need to provide a regex for the collection names in the pipeline property, and you can even provide a regex for the database names if you have multiple databases.

"pipeline": "[{\"$match\":{\"$and\":[{\"ns.db\":{\"$regex\":/^database-name$/}},{\"ns.coll\":{\"$regex\":/^journal_.*/}}]}}]"  

You can even exclude any given database using $nin, if you don't want to listen to its change stream at all.

"pipeline": "[{\"$match\":{\"$and\":[{\"ns.db\":{\"$regex\":/^database-name$/,\"$nin\":[/^any_database_name$/]}},{\"ns.coll\":{\"$regex\":/^journal_.*/}}]}}]"

Here is the complete Kafka connector configuration.

Mongo to Kafka source connector

{
  "name": "mongo-to-kafka-connect",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "publish.full.document.only": "true",
    "tasks.max": "3",
    "key.converter.schemas.enable": "false",
    "topic.creation.enable": "true",
    "poll.await.time.ms": 1000,
    "poll.max.batch.size": 100,
    "topic.prefix": "any prefix for topic name",
    "output.json.formatter": "com.mongodb.kafka.connect.source.json.formatter.SimplifiedJson",
    "connection.uri": "mongodb://<username>:<password>@ip:27017,ip:27017,ip:27017,ip:27017/?authSource=admin&replicaSet=xyz&tls=true",
    "value.converter.schemas.enable": "false",
    "copy.existing": "true",
    "topic.creation.default.replication.factor": 3,
    "topic.creation.default.partitions": 3,
    "topic.creation.compacted.cleanup.policy": "compact",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "mongo.errors.log.enable": "true",
    "heartbeat.interval.ms": 10000,
    "pipeline": "[{\"$match\":{\"$and\":[{\"ns.db\":{\"$regex\":/^database-name$/}},{\"ns.coll\":{\"$regex\":/^journal_.*/}}]}}]"
  }
}
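To deploy it, save the JSON above to a file and POST it to the Kafka Connect REST API. A minimal sketch, assuming Connect listens on localhost:8083 and the file is named connector.json (adjust both for your environment):

# Register the connector with the Kafka Connect REST API
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d @connector.json

# Verify that the connector and its tasks are RUNNING
curl http://localhost:8083/connectors/mongo-to-kafka-connect/status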

You can get more details from the official docs.
