How to transform JSON value to Kafka message key with Debezium MongoDB Source Connector?

I'm using the Debezium MongoDB connector to listen to a specific MongoDB collection and publish each entry as a message to a Kafka topic. This works fine with the following Kafka Connect configuration:

{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection"
  }
}

With this configuration, each Kafka message carries the id of the original MongoDB record as its key. Now I'm trying to apply a key transformation so that the value of a specific field inside the JSON document becomes the Kafka message key, because the data should be partitioned by this field.
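
For illustration, suppose a document in the source collection looks roughly like this (the field names here are placeholders, not the actual ones from my collection):

{
  "_id": { "$oid": "5d9f1c2e8b4a2f0001a3b4c5" },
  "customerId": "A-1001",
  "status": "ACTIVE"
}

The goal is that a field such as customerId ends up as the Kafka message key, so that all events for the same customer land in the same partition.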

I already tried the following config for creating a key:

{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection",
    "transforms":"createKey",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey", 
    "transforms.createKey.fields": "specific-field-in-mongodb-source-record"
  }
}

With this configuration, however, I only receive the following error in Kafka Connect:

[2019-10-10 11:35:44,049] INFO 2048 records sent for replica set 'dev-shard-01', last offset: {sec=1570707340, ord=1, initsync=true, h=-8774414475389548112} (io.debezium.connector.mongodb.MongoDbConnectorTask)
[2019-10-10 11:35:44,050] INFO WorkerSourceTask{id=mongo-source-connector-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask)
[2019-10-10 11:35:44,050] INFO WorkerSourceTask{id=mongo-source-connector-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask)
[2019-10-10 11:35:44,050] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.NullPointerException
        at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:85)
        at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)
        at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:218)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:194)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2019-10-10 11:35:44,050] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)

Another configuration I tried is the following:

{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection",
    "transforms": "unwrap,insertKey,extractKey",
    "transforms.unwrap.type": "io.debezium.transforms.UnwrapFromEnvelope",
    "transforms.unwrap.drop.tombstones": "false",
    "transforms.insertKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.insertKey.fields": "specific-field-in-mongodb-source-record",
    "transforms.extractKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractKey.field": "specific-field-in-mongodb-source-record",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "true",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}

This also leads to an error:

[2019-10-10 12:27:04,915] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [copying fields from value to key], found: java.lang.String
        at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
        at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:79)
        at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:65)
        at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:218)
        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:194)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2019-10-10 12:27:04,915] ERROR WorkerSourceTask{id=mongo-source-connector-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)

Does anyone know if and how I can turn a field of the JSON document from MongoDB into the Kafka message key?

Thanks!

After some more testing, I found a suitable solution. It turns out that I don't need the third transformation (the ExtractField$Key step); unwrapping the MongoDB envelope and then applying the ValueToKey transformation is enough. Note that, compared with my second attempt, the working configuration uses the MongoDB-specific io.debezium.connector.mongodb.transforms.UnwrapFromMongoDbEnvelope SMT instead of io.debezium.transforms.UnwrapFromEnvelope, so the document fields are available at the top level of the record value where ValueToKey can find them.

For the sake of completeness, here is the working configuration:

{
  "name": "mongo-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "192.168.0.151:27017",
    "mongodb.name": "mongo",
    "database.whitelist": "database",
    "tasks.max": 1,
    "max.batch.size": 2048,
    "poll.interval.ms": 5000,
    "collection.whitelist": "database.collection",
    "transforms": "unwrap,insertKey",
    "transforms.unwrap.type": "io.debezium.connector.mongodb.transforms.UnwrapFromMongoDbEnvelope",
    "transforms.unwrap.drop.tombstones": "false",
    "transforms.unwrap.delete.handling.mode":"drop",
    "transforms.unwrap.operation.header":"true",
    "transforms.insertKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.insertKey.fields": "specific-field-in-mongodb-source-record",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
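
To double-check that the key is really set, a plain console consumer can print keys alongside values. A minimal sketch, assuming the broker runs on localhost:9092 and the topic follows Debezium's <mongodb.name>.<database>.<collection> naming (here mongo.database.collection):

kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic mongo.database.collection \
  --property print.key=true \
  --from-beginning

Each line should then show the key derived from the configured field, followed by the unwrapped JSON document as the value.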
