
Apache Kafka - Load-Balancing between Consumers consuming only from a specific partition

I understand that in Apache Kafka I can write a Producer and a Partitioner so that messages of TypeA go to PartitionA and messages of TypeB go to PartitionB. I can also write a Consumer/ConsumerGroup so that Consumer/ConsumerGroupA consumes only from PartitionA and Consumer/ConsumerGroupB consumes only from PartitionB, using assign().
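The routing described in the question can be sketched in plain Python. This is only an illustration of the decision a custom Partitioner would make, not Kafka client code; the TYPE_TO_PARTITION table and the choose_partition helper are hypothetical names invented for this example.

```python
# Hypothetical mapping from message type to partition number.
# Nothing here is part of the Kafka API; it only models the routing rule.
TYPE_TO_PARTITION = {"TypeA": 0, "TypeB": 1}


def choose_partition(message_type: str, num_partitions: int) -> int:
    """Return the partition a message of the given type should be sent to."""
    partition = TYPE_TO_PARTITION[message_type]
    if partition >= num_partitions:
        raise ValueError(f"topic has no partition {partition}")
    return partition


# Route a small batch of (type, payload) messages into two partitions.
messages = [("TypeA", "a1"), ("TypeB", "b1"), ("TypeA", "a2")]
partitions = {0: [], 1: []}
for msg_type, payload in messages:
    partitions[choose_partition(msg_type, 2)].append(payload)

print(partitions)
```

A consumer that is manually assigned to partition 0 would then see only the TypeA payloads, which is the setup the question asks about.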

But what I really want to understand is whether this is good practice at all. From what I understand, it would severely restrict my load-balancing capability and increase complexity at the same time. Suppose my TypeA messages increase and I want to create another partition, say PartitionA2, to handle the load. If I create more Consumers, add both the new and the old Consumers to a ConsumerGroup, and want them to collectively process items from both the new and the old partitions, will I be able to do that?

Using assign() doesn't restrict your load-balancing capabilities; it just puts in your hands all the problems of reassigning partitions when a consumer comes up or goes down. That is something you get for free with subscribe(). Regarding your specific question: when you add PartitionA2, you can certainly add another consumer that uses assign() to be assigned to that partition.
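To make "all the problems of reassigning partitions" concrete, here is a minimal plain-Python sketch of the bookkeeping an application takes on when it uses manual assignment. The distribute function and the consumer names c1 and c2 are assumptions made for illustration; with subscribe(), the consumer group protocol does the equivalent work for you.

```python
def distribute(partitions, consumers):
    """Round-robin the sorted partitions over the sorted consumers.

    The application must call this again (and re-assign) every time a
    consumer starts, stops, or a partition is added.
    """
    if not consumers:
        raise ValueError("need at least one live consumer")
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


# One consumer owns both TypeA partitions.
before = distribute(["PartitionA", "PartitionA2"], ["c1"])
# A second consumer comes up: the split has to be recomputed by hand.
after_join = distribute(["PartitionA", "PartitionA2"], ["c1", "c2"])
# c1 goes down: c2 has to take over everything.
after_crash = distribute(["PartitionA", "PartitionA2"], ["c2"])

print(before, after_join, after_crash, sep="\n")
```

The sketch leaves out the harder parts, such as detecting that a consumer has died and handing over offsets safely, which is where most of the extra complexity sits.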

With the subscribe API you can add more consumer instances to a consumer group. With the assign API you have to handle rebalancing yourself. Also, if your application depends on the partitioning strategy (for example, on event ordering), you may not want to change the partitioning. For example, say you have one partition for user login/logout actions. If you change the partitioning to have 2 partitions (one for login and one for logout), your application can see a logout event before the login event for a particular user. Of course, you need to look at what TypeA is and whether it is okay to send TypeA events to 2 different partitions.
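The ordering concern can be illustrated with a small plain-Python comparison, since Kafka only guarantees ordering within a single partition. The partition_by_key helper is hypothetical and uses crc32 as a deterministic stand-in; Kafka's default partitioner hashes keys with murmur2, so real partition numbers would differ.

```python
import zlib


def partition_by_key(key: str, num_partitions: int) -> int:
    """Hash a record key to a partition (crc32 stands in for Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [("alice", "login"), ("bob", "login"),
          ("alice", "logout"), ("bob", "logout")]
users = ("alice", "bob")

# Strategy 1: key by user, so a user's login and logout share a partition.
by_user = {(user, action): partition_by_key(user, 2) for user, action in events}

# Strategy 2: one partition per action, as in the login/logout example.
by_action = {(user, action): {"login": 0, "logout": 1}[action]
             for user, action in events}

ordered_per_user_by_key = all(
    by_user[(u, "login")] == by_user[(u, "logout")] for u in users)
ordered_per_user_by_action = all(
    by_action[(u, "login")] == by_action[(u, "logout")] for u in users)

print(ordered_per_user_by_key, ordered_per_user_by_action)
```

When a user's login and logout land in different partitions, nothing stops a consumer from reading the logout first, which is the situation the answer warns about.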

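Thanks to ppatierno and mrnakumar for the help. My TypeA events will all be independent and won't need time ordering, but the fact that I would have to handle the rebalancing myself is certainly discouraging.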


 