
Kafka partitions meaning

When we decide on the number of partitions, should we do that on a per-topic basis, or is it a decision that spans all topics? If T1 has 3 partitions and T2 has 2 partitions, can both be consumed by 1 consumer?

Or is it better to use an equal number of partitions when the topics must be consumed by 1 consumer?

I ask because a high-level consumer can be created by passing topics and a partition number.

So I wonder: should I pass to that constructor only topics with the same number of partitions?

When we create a high-level consumer, we pass not the partition number but the intended number of consuming threads (streams).

The answer is yes: both topics can be consumed by 1 consumer, provided that consumer subscribes to both. The consumer just opens N streams, the intended number of consuming threads, which you pass as a parameter. If N < P (the total number of partitions across all subscribed topics), some streams will collect data from several partitions. If N > P, some streams will sit idle in a non-busy wait.
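To make the N versus P cases concrete, here is a minimal simulation. It does not use Kafka at all. The helper name `assign_partitions` is hypothetical, and the round-robin spreading is an assumption chosen for simplicity, not a reproduction of Kafka's own assignment logic. It only illustrates how 5 partitions (T1 with 3, T2 with 2) might land on N streams.

```python
def assign_partitions(partitions_per_topic, num_streams):
    """Spread every (topic, partition) pair over num_streams streams.

    Assumption: plain round-robin, for illustration only. Kafka's real
    assignment strategy may distribute partitions differently.
    """
    streams = [[] for _ in range(num_streams)]
    all_partitions = [
        (topic, partition)
        for topic, count in sorted(partitions_per_topic.items())
        for partition in range(count)
    ]
    for index, topic_partition in enumerate(all_partitions):
        streams[index % num_streams].append(topic_partition)
    return streams


topics = {"T1": 3, "T2": 2}  # P = 5 partitions in total

for n in (2, 5, 7):  # N < P, N = P, N > P
    streams = assign_partitions(topics, n)
    idle = sum(1 for stream in streams if not stream)
    print(f"N={n}: partitions per stream {[len(s) for s in streams]}, idle streams: {idle}")
```

With N=2 the two streams carry 3 and 2 partitions. With N=5 each stream carries exactly one. With N=7 two streams receive nothing and simply wait.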

Having P = N is desirable, but N > P is even better: if new partitions appear tomorrow, the spare streams will be ready for the greater load.

I did some research on this and wrote a blog entry.
