
Use the Kafka Connect MongoDB Debezium source connector on a remote MSK Kafka cluster

I want to read data from MongoDB into a Kafka topic. I managed to get this working locally by using the following connector properties file:

name=mongodb-source-connectorszes
connector.class=io.debezium.connector.mongodb.MongoDbConnector
mongodb.hosts=test/localhost:27017
database.history.kafka.bootstrap.servers=kafka:9092
mongodb.name=mongo_conn
database.whitelist=test
initial.sync.max.threads=1
tasks.max=1

The Connect worker has the following configuration:

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true

offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000


zookeeper.connect=localhost:2181

rest.port=18083

# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include 
# any combination of: 
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples: 
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java/test

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
bootstrap.servers=localhost:9092
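
For reference, a setup like this is typically launched in standalone mode with something along these lines (the file names are placeholders, and the Debezium MongoDB connector jars are assumed to sit under the configured plugin.path):

bin/connect-standalone.sh connect-worker.properties mongodb-source-connector.properties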

This works flawlessly with my local Kafka. I want to run it against a remote MSK Kafka cluster. Since MSK has no built-in support for adding new Kafka Connect plugins, I am having difficulty getting my Kafka Connect MongoDB source plugin to work. To point the connector from my local machine at the remote cluster, I made the following modifications. At the connector properties level:

name=mongodb-source-connectorszes
connector.class=io.debezium.connector.mongodb.MongoDbConnector
mongodb.hosts=test/localhost:27017  #keeping the same local mongo
database.history.kafka.bootstrap.servers=remote-msk-kakfa-brokers:9092
mongodb.name=mongo_conn
database.whitelist=test
initial.sync.max.threads=1
tasks.max=1

At the Connect worker level, I made the following modifications:

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true

offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000


zookeeper.connect=remote-msk-kakfa-zookeeper:9092:2181

rest.port=18083

# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include 
# any combination of: 
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples: 
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java/test

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
bootstrap.servers=remote-msk-kakfa-brokers:9092:9092
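
One thing worth checking at this point is which listener the MSK cluster actually exposes: plaintext on port 9092 only works if it is enabled on the cluster, while TLS-only clusters listen on 9094 and require the client protocol to be set explicitly. A sketch of the extra worker entries for that case (assuming a TLS-only MSK setup, which may not match this cluster):

security.protocol=SSL
producer.security.protocol=SSL
consumer.security.protocol=SSL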

But it seems this is not enough, as I am getting the following error:

[2020-01-31 11:58:01,619] WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 118 : {mongo_conn.test.docs=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient:1031)
[2020-01-31 11:58:01,731] WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 119 : {mongo_conn.test.docs=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient:1031)
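
This warning usually means the topic mongo_conn.test.docs does not exist on the cluster and cannot be auto-created; MSK's default configuration ships with auto.create.topics.enable=false. A possible workaround, assuming the broker address above and a three-broker cluster, is to pre-create the topic:

bin/kafka-topics.sh --create --bootstrap-server remote-msk-kakfa-brokers:9092 --partitions 1 --replication-factor 3 --topic mongo_conn.test.docs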

Usually, I can reach the Kafka MSK cluster from my local machine (via a VPN and sshuttle to an EC2 instance). For example, to list the topics in the remote MSK cluster, I just have to run:

bin/kafka-topics.sh --list --zookeeper  remote-zookeeper-server:2181

from my local Kafka installation folder.

This command works perfectly, without changing server.properties on my local machine. Any idea how to solve this, so that I can export the Kafka Debezium MongoDB source connector to the Kafka MSK cluster?

It's recommended to use the connect-distributed script and properties for running Connect/Debezium.
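
A minimal sketch of such a distributed worker configuration pointing at MSK (the group id, internal topic names and replication factors below are placeholders that need to match your cluster):

bootstrap.servers=remote-msk-kakfa-brokers:9092
group.id=connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
config.storage.topic=connect-configs
config.storage.replication.factor=3
status.storage.topic=connect-status
status.storage.replication.factor=3
plugin.path=/usr/share/java/test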

Anything that says zookeeper.connect should be removed (only Kafka brokers use that). Anything that says bootstrap.servers should point at the address MSK gives you.
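
With a distributed worker, the connector itself is then registered through the Connect REST API instead of a properties file. A sketch reusing the settings from the question (the worker host and port 8083 are assumptions):

curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "mongodb-source-connectorszes",
  "config": {
    "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
    "mongodb.hosts": "test/localhost:27017",
    "mongodb.name": "mongo_conn",
    "database.whitelist": "test",
    "initial.sync.max.threads": "1",
    "tasks.max": "1"
  }
}'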

If you're getting connection errors, make sure you check your firewall / VPC settings.
