
How does a Kafka stream get distributed among TaskManagers in Flink?

Say a Flink job with three task managers (tm1, tm2, and tm3) consumes a Kafka topic as a source. How does the stream get distributed among them? Who does the distribution?

This is done in FlinkKafkaConsumerBase, in its open() method. The Flink runtime context provides methods that each instance can use to determine the total number of parallel instances of the Flink Kafka consumer, as well as its own index among them. Each instance uses these methods to independently take responsibility for reading from specific partitions, so no central component hands out the partitions.
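The idea of each instance picking its own partitions can be sketched in plain Java. This is a minimal illustration, not Flink's actual code: the class name, the method partitionsFor, and the plain modulo rule are assumptions made for the example, and the two integer arguments stand in for the values a consumer would obtain from the runtime context (for example via getIndexOfThisSubtask() and getNumberOfParallelSubtasks()). The real assignment formula may differ between Flink versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: NOT Flink's implementation, only an illustration of
// "every parallel instance decides locally which partitions it reads".
class PartitionAssignmentSketch {

    // Returns the partitions that the subtask with the given index would read,
    // ASSUMING a plain round-robin rule: partition p goes to subtask p % parallelism.
    static List<Integer> partitionsFor(int subtaskIndex, int parallelism, int numPartitions) {
        if (parallelism <= 0 || subtaskIndex < 0 || subtaskIndex >= parallelism) {
            throw new IllegalArgumentException("invalid subtask index or parallelism");
        }
        List<Integer> mine = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            if (p % parallelism == subtaskIndex) {
                mine.add(p);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        int parallelism = 3;   // e.g. one source subtask on each of tm1, tm2, tm3
        int numPartitions = 7; // assumed partition count of the topic
        for (int i = 0; i < parallelism; i++) {
            System.out.println("subtask " + i + " reads "
                    + partitionsFor(i, parallelism, numPartitions));
        }
    }
}
```

Because every subtask evaluates the same deterministic rule with its own index, the subtasks end up with disjoint partition sets without having to coordinate with each other.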

Adding to what David wrote, you should keep one thing in mind: the maximum useful parallelism of a Kafka consumer is limited by the number of partitions. Flink distributes the tasks starting with the first slot (on the first task manager), then continues with the second, and so on, and it repeats this for each source. As a result, you might see an unbalanced workload if you have more task managers than topic partitions.

In a scenario where you have many Kafka sources, each with a small number of topic partitions, this imbalance becomes more and more visible. In an extreme case, where you have many sources with only one partition each, all of these sources will be consumed by the first slot/task manager. You can work around this by using slot sharing groups. This is of course an edge case, but it is worth keeping in mind when you define your resources and workflows.
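The imbalance described above can be made concrete with a small simulation. This is only a sketch under stated assumptions: it assumes a plain modulo rule (partition p is read by subtask p % parallelism) and that subtask i of every source lands in the same slot, as the answer describes. The class and method names are hypothetical, and real Flink scheduling and assignment may behave differently.

```java
// Hypothetical sketch: counts how many partitions each subtask index reads
// when several Kafka sources run with the same parallelism.
class SourceImbalanceSketch {

    // loads[i] = number of partitions read by subtask index i, summed over all sources.
    // ASSUMPTION: partition p of every source is read by subtask p % parallelism.
    static int[] partitionsPerSubtask(int[] partitionsPerSource, int parallelism) {
        int[] loads = new int[parallelism];
        for (int numPartitions : partitionsPerSource) {
            for (int p = 0; p < numPartitions; p++) {
                loads[p % parallelism]++;
            }
        }
        return loads;
    }

    public static void main(String[] args) {
        // Five sources with a single partition each, parallelism 3.
        int[] loads = partitionsPerSubtask(new int[] {1, 1, 1, 1, 1}, 3);
        System.out.println(java.util.Arrays.toString(loads));
    }
}
```

Under these assumptions, five single-partition sources put all the reading work on subtask index 0 while the other two indexes stay idle, whereas sources with at least as many partitions as the parallelism spread evenly.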
