Use the Kafka Connect MongoDB Debezium source connector on a remote MSK Kafka cluster
I want to read data from MongoDB into a Kafka topic. I managed to get this working locally by using the following connector properties file:
name=mongodb-source-connectorszes
connector.class=io.debezium.connector.mongodb.MongoDbConnector
mongodb.hosts=test/localhost:27017
database.history.kafka.bootstrap.servers=kafka:9092
mongodb.name=mongo_conn
database.whitelist=test
initial.sync.max.threads=1
tasks.max=1
The Connect worker has the following configuration:
# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
zookeeper.connect=localhost:2181
rest.port=18083
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java/test
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
bootstrap.servers=localhost:9092
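For context, a standalone Connect worker is launched with both of these files, the worker config first and then one or more connector property files (the file names below are illustrative, assuming a stock Apache Kafka download):

```shell
# Start a standalone Connect worker: first argument is the worker config,
# the rest are connector configs (file names here are assumptions)
bin/connect-standalone.sh config/connect-worker.properties \
    config/mongodb-source-connector.properties
```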
This works flawlessly on my local Kafka. I want to run it on a remote MSK Kafka cluster. Since MSK has no built-in support for installing new Kafka Connect plugins, I am having difficulty getting my Kafka Connect MongoDB source plugin to work. To point the connector from my local machine at the remote cluster, I made the following modifications. At the connector properties level:
name=mongodb-source-connectorszes
connector.class=io.debezium.connector.mongodb.MongoDbConnector
# keeping the same local mongo
mongodb.hosts=test/localhost:27017
database.history.kafka.bootstrap.servers=remote-msk-kakfa-brokers:9092
mongodb.name=mongo_conn
database.whitelist=test
initial.sync.max.threads=1
tasks.max=1
At the Connect worker level, I made the following modifications:
# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
zookeeper.connect=remote-msk-kakfa-zookeeper:2181
rest.port=18083
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java/test
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
bootstrap.servers=remote-msk-kakfa-brokers:9092
But it seems that this is not enough, as I am getting the following error:
[2020-01-31 11:58:01,619] WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 118 : {mongo_conn.test.docs=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient:1031)
[2020-01-31 11:58:01,731] WARN [Producer clientId=producer-1] Error while fetching metadata with correlation id 119 : {mongo_conn.test.docs=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient:1031)
Usually, I can reach the Kafka MSK cluster from my local machine (via a VPN, and sshuttle to an EC2 instance). For example, to list the topics in the remote MSK cluster, I just have to run:
bin/kafka-topics.sh --list --zookeeper remote-zookeeper-server:2181
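Note that recent Kafka versions (2.2+) can also list topics by talking to the brokers directly, which avoids needing ZooKeeper access at all (the broker hostname below is a placeholder taken from the question):

```shell
# List topics via the brokers instead of ZooKeeper (Kafka 2.2+)
bin/kafka-topics.sh --list --bootstrap-server remote-msk-kakfa-brokers:9092
```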
by going to my local Kafka installation folder. This command works perfectly, without changing server.properties on my local machine. Any idea how to solve this, in order to export the Kafka Debezium MongoDB source connector to the MSK cluster?
It's recommended to use the connect-distributed script and properties file for running Connect/Debezium.
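In distributed mode, the worker is started on its own and connectors are then created through the Connect REST API rather than a properties file. A sketch, reusing the settings and the rest.port=18083 from the question (file name and connector name are assumptions):

```shell
# Start a distributed Connect worker (config file name is an assumption)
bin/connect-distributed.sh config/connect-distributed.properties

# Then register the Debezium MongoDB connector over the Connect REST API
curl -X POST -H "Content-Type: application/json" \
  http://localhost:18083/connectors \
  -d '{
    "name": "mongodb-source-connector",
    "config": {
      "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
      "mongodb.hosts": "test/localhost:27017",
      "mongodb.name": "mongo_conn",
      "database.whitelist": "test",
      "initial.sync.max.threads": "1",
      "tasks.max": "1"
    }
  }'
```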
Anything that says zookeeper.connect should be removed (only Kafka brokers use that). Anything that says bootstrap servers should point at the addresses MSK gives you.
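Put together, a minimal distributed worker config for MSK might look like the sketch below (broker address, group id, and topic names are placeholders; note there is no zookeeper.connect and no local offset file, since distributed mode stores offsets, configs, and status in Kafka topics):

```
bootstrap.servers=remote-msk-kakfa-brokers:9092
group.id=connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
# Distributed mode keeps state in Kafka topics instead of a local file
offset.storage.topic=connect-offsets
config.storage.topic=connect-configs
status.storage.topic=connect-status
offset.storage.replication.factor=3
config.storage.replication.factor=3
status.storage.replication.factor=3
rest.port=18083
plugin.path=/usr/share/java/test
```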
If you're getting connection errors, make sure you check your firewall / VPC settings.