Distributed official MongoDB Kafka Source Connector with multiple tasks not working

I am running Apache Kafka on my Windows machine with two Kafka Connect workers (ports 8083 and 8084) and one topic with three partitions (replication factor of one). Failover works: whenever I shut down one of the workers, the connector moves to the other. Load balancing does not happen, though, because the number of tasks is always one. I am using the official MongoDB Kafka Connector as a source (change stream) with tasks.max=6. I tried updating MongoDB from multiple threads so that more data would flow into Kafka Connect and perhaps prompt it to create more tasks. Even under a higher volume of data, the task count remains one.

How did I confirm that only one task is running? Through the API at http://localhost:8083/connectors/mongodb-connector/status. Response:

    {
      "name": "mongodb-connector",
      "connector": {
        "state": "RUNNING",
        "worker_id": "xx.xx.xx.xx:8083"
      },
      "tasks": [
        {
          "id": 0,
          "state": "RUNNING",
          "worker_id": "xx.xx.xx.xx:8083"
        }
      ],
      "type": "source"
    }

Am I missing something here? Why are more tasks not created?
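For anyone who wants to repeat this check against both workers, here is a minimal sketch that counts RUNNING tasks per worker from the status response. The helper names fetch_status and running_tasks_by_worker are made up for illustration; only the /connectors/<name>/status endpoint and the response fields come from the output shown in the question.

```python
import json
from urllib.request import urlopen


def fetch_status(base_url: str, connector: str) -> dict:
    """GET /connectors/<name>/status from a Kafka Connect worker.

    Hypothetical helper; base_url is assumed to look like
    "http://localhost:8083".
    """
    with urlopen(f"{base_url}/connectors/{connector}/status") as resp:
        return json.load(resp)


def running_tasks_by_worker(status: dict) -> dict:
    """Map each worker_id to the number of RUNNING tasks it hosts."""
    counts = {}
    for task in status.get("tasks", []):
        if task.get("state") == "RUNNING":
            worker = task.get("worker_id")
            counts[worker] = counts.get(worker, 0) + 1
    return counts


if __name__ == "__main__":
    # Needs a reachable Kafka Connect worker; either worker can be asked.
    status = fetch_status("http://localhost:8083", "mongodb-connector")
    print(running_tasks_by_worker(status))
```

If load balancing were happening, the resulting mapping would list more than one worker_id; with the response above it contains a single entry.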

It seems this is the behavior of the official MongoDB Kafka Source Connector. This is the answer I got on another forum from Ross Lawley (MongoDB developer):

Prior to 1.2.0 only a single task was supported by the sink connector. The Source connector still only supports a single task, this is because it uses a single Change Stream cursor. This is enough to watch and publish changes cluster wide, database wide or down to a single collection.

I raised this ticket: https://jira.mongodb.org/browse/KAFKA-121 and got the following response:

The source connector will only ever produce a single task. This is by design as the source connector is backed by a change stream. Change streams internally use the same data as used by replication engine and as such should be able to scale as the database does. There are no plans to allow multiple cursors, however, should you feel that this is not meeting your requirements, then you can configure multiple connectors and each would have its own change stream cursor.
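One way to act on the "configure multiple connectors" suggestion is to register one source connector per collection, so each gets its own change stream cursor. The sketch below only builds the request bodies and rests on several assumptions: splitting by collection is my choice (the response does not say how to split), and the connection URI, database, collection names, topic prefix, and connector naming scheme are placeholders. The config keys are the MongoDB source connector's standard properties.

```python
import json


def build_source_connectors(connection_uri, database, collections, topic_prefix):
    """Build one Kafka Connect "create connector" payload per collection.

    Hypothetical helper: each payload is meant to be sent as the JSON body
    of POST /connectors on a Kafka Connect worker.
    """
    payloads = []
    for collection in collections:
        payloads.append({
            # Naming scheme is an assumption; any unique name works.
            "name": f"mongodb-source-{database}-{collection}",
            "config": {
                "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
                "connection.uri": connection_uri,
                "database": database,
                "collection": collection,
                "topic.prefix": topic_prefix,
                # The source connector only ever runs one task per connector.
                "tasks.max": "1",
            },
        })
    return payloads


if __name__ == "__main__":
    # Placeholder values, not taken from the original post.
    for payload in build_source_connectors(
        "mongodb://localhost:27017", "mydb", ["orders", "customers"], "mongo"
    ):
        print(json.dumps(payload, indent=2))
```

With several connectors registered, the Connect cluster can place them on different workers, which is what spreads the load in place of multiple tasks.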
