Dynamic creation of Kafka Connectors
I have deployed a Kafka cluster and a Kafka Connect cluster in Kubernetes, using Strimzi and AKS. I wanted to start reading from RSS resources to feed my Kafka cluster, so I created a connector instance of "org.kaliy.kafka.connect.rss.RssSourceConnector" which reads from a specific RSS feed, given a URL, and writes to a specific topic. But my whole intention with this is to eventually have a Kafka Connect cluster able to manage a lot of external requests for new RSS feeds to read from; and here is where all my doubts come in:

Right now, I think my best option is to rely on the Kafka Connect REST API, making the external client responsible for managing the state of the set of connectors, but I don't know if it was designed to receive as many requests as would be the case here. Maybe it could be scaled by provisioning several listeners in the Kafka Connect REST API configuration, but I do not know.

Thanks a lot!
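For reference, registering a new connector instance through the Kafka Connect REST API boils down to a single `POST /connectors` call per feed. A minimal sketch in Python follows; the `rss.urls` and `topic` config keys are assumptions based on the kaliy RSS connector, and the Connect URL is a placeholder, so adjust both to your setup:

```python
import json
import urllib.request

# Assumption: the Connect REST API is reachable at this address
CONNECT_URL = "http://localhost:8083"


def build_connector_config(name: str, feed_url: str, topic: str) -> dict:
    """Build the JSON payload for one RSS source connector instance."""
    return {
        "name": name,
        "config": {
            # Connector class from the question; config keys are assumptions
            "connector.class": "org.kaliy.kafka.connect.rss.RssSourceConnector",
            "rss.urls": feed_url,
            "topic": topic,
            "tasks.max": "1",
        },
    }


def create_connector(name: str, feed_url: str, topic: str) -> int:
    """POST the connector config to the Connect REST API; returns HTTP status."""
    payload = json.dumps(build_connector_config(name, feed_url, topic)).encode()
    req = urllib.request.Request(
        f"{CONNECT_URL}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # raises on HTTP errors
        return resp.status  # 201 Created on success
```

An external client could call `create_connector(...)` once per RSS feed, which creates one connector instance per feed.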
One of the main benefits of using Kafka Connect is its configuration-driven approach, so you would lose this by implementing your own connector-management layer. In my opinion, the best strategy is to have one connector instance for each RSS feed. Reducing the number of instances could make sense when you have a single data source system, to avoid overloading it.
Using the Strimzi operator, the Kafka Connect cluster is monitored and the operator will try to restore the desired cluster state when needed. This does not include the individual connector instances and their tasks, but you can leverage the K8s API to monitor the status of the KafkaConnector custom resource (CR), instead of the REST API.
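With the Strimzi operator, a connector like the one in the question can also be created declaratively as a `KafkaConnector` CR instead of through the REST API. A sketch follows; the cluster name, feed URL, and topic are placeholders, and the `rss.urls`/`topic` config keys are assumptions based on the kaliy RSS connector:

```yaml
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
metadata:
  name: rss-source
  labels:
    # Must match the name of your KafkaConnect resource (assumed name)
    strimzi.io/cluster: my-connect-cluster
spec:
  class: org.kaliy.kafka.connect.rss.RssSourceConnector
  tasksMax: 1
  config:
    rss.urls: "https://example.com/feed.xml"  # hypothetical feed URL
    topic: rss-topic                          # hypothetical target topic
```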
Example:
$ kubectl get kafkaconnector amq-sink -o yaml
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
# ...
status:
  conditions:
  - lastTransitionTime: "2020-12-07T10:30:28.349Z"
    status: "True"
    type: Ready
  connectorStatus:
    connector:
      state: RUNNING
      worker_id: 10.116.0.66:8083
    name: amq-sink
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.116.0.66:8083
      type: sink
  observedGeneration: 1
It may be late, but it could help anyone who passes by this question: it is worth having a look at the Kafka Connect CR (custom resource) that is part of Confluent for Kubernetes (CFK), which introduces a clear-cut declarative way to manage and monitor connectors, with health checks and auto-healing.

https://www.confluent.io/blog/declarative-connectors-with-confluent-for-kubernetes/