Dynamic creation of Kafka Connectors
I have deployed a Kafka cluster and a Kafka Connect cluster in Kubernetes, using Strimzi and AKS. I wanted to start reading from RSS resources to feed my Kafka cluster, so I created a connector instance of "org.kaliy.kafka.connect.rss.RssSourceConnector" which reads from a specific RSS feed, given a URL, and writes to a specific topic. But my whole intention with this is to eventually have a Kafka Connect cluster able to manage a lot of external requests for new RSS feeds to read from; and here is where all my doubts come in:

Right now, I think my best option is to rely on the Kafka Connect REST API, making the external client responsible for managing the state of the set of connectors, but I don't know if it was designed to receive as many requests as would be the case here. Maybe it could be scaled by provisioning several listeners in the Kafka Connect REST API configuration, but I do not know.

Thanks a lot!
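For reference, registering a new connector instance through the Kafka Connect REST API boils down to a single `POST /connectors` call per feed. A minimal sketch in Python follows; the `rss.urls` and `topic` config keys are assumptions based on the kaliy RSS connector, and the Connect URL is a placeholder, so adjust both to your setup:

```python
import json
import urllib.request

# Assumption: the Connect REST API is reachable at this address
CONNECT_URL = "http://localhost:8083"


def build_connector_config(name: str, feed_url: str, topic: str) -> dict:
    """Build the JSON payload for one RSS source connector instance."""
    return {
        "name": name,
        "config": {
            # Connector class from the question; config keys are assumptions
            "connector.class": "org.kaliy.kafka.connect.rss.RssSourceConnector",
            "rss.urls": feed_url,
            "topic": topic,
            "tasks.max": "1",
        },
    }


def create_connector(name: str, feed_url: str, topic: str) -> int:
    """POST the connector config to the Connect REST API; returns HTTP status."""
    payload = json.dumps(build_connector_config(name, feed_url, topic)).encode()
    req = urllib.request.Request(
        f"{CONNECT_URL}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # raises on HTTP errors
        return resp.status  # 201 Created on success
```

An external client could call `create_connector(...)` once per RSS feed, which creates one connector instance per feed.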
One of the main benefits of using Kafka Connect is its configuration-driven approach, so you would lose this by implementing your own connector-management layer. In my opinion, the best strategy is to have one connector instance for each RSS feed. Reducing the number of instances could make sense when you have a single data source system, to avoid overloading it.
Using the Strimzi operator, the Kafka Connect cluster is monitored and the operator will try to restore the desired cluster state when needed. This does not include the individual connector instances and their tasks, but you can leverage the K8s API to monitor the status of the KafkaConnector custom resource (CR), instead of the REST API.
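With the Strimzi operator, a connector like the one in the question can also be created declaratively as a `KafkaConnector` CR instead of through the REST API. A sketch follows; the cluster name, feed URL, and topic are placeholders, and the `rss.urls`/`topic` config keys are assumptions based on the kaliy RSS connector:

```yaml
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
metadata:
  name: rss-source
  labels:
    # Must match the name of your KafkaConnect resource (assumed name)
    strimzi.io/cluster: my-connect-cluster
spec:
  class: org.kaliy.kafka.connect.rss.RssSourceConnector
  tasksMax: 1
  config:
    rss.urls: "https://example.com/feed.xml"  # hypothetical feed URL
    topic: rss-topic                          # hypothetical target topic
```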
Example:
$ kubectl get kafkaconnector amq-sink -o yaml
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
# ...
status:
  conditions:
  - lastTransitionTime: "2020-12-07T10:30:28.349Z"
    status: "True"
    type: Ready
  connectorStatus:
    connector:
      state: RUNNING
      worker_id: 10.116.0.66:8083
    name: amq-sink
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.116.0.66:8083
      type: sink
  observedGeneration: 1
It may be late, but it could help anyone who passes by this question: it is worth having a look at the Kafka Connect CR (custom resource) that is part of Confluent for Kubernetes (CFK), which introduces a clear-cut declarative way to manage and monitor connectors, with health checks and auto-healing.

https://www.confluent.io/blog/declarative-connectors-with-confluent-for-kubernetes/