Why does co-partitioning of two KStreams in Kafka require the same number of partitions for both streams?
As the name "co-partition" indicates, you want records from different topics that share the same key to be processed by the same Kafka Streams application instance. If the topics don't have the same number of partitions, it's not possible to guarantee this behavior.
Assume you have topic A with 2 partitions and topic B with 3 partitions. It can then happen that a record with key X is hashed to partitions A-0 and B-1 (i.e., not the same partition number), while a different key Y is hashed to A-0 but B-2. The records in A-0 would then have to be matched against more than one partition of B, so no fixed pairing of partitions works.
Only if the number of partitions is the same for both topics (and both are written with the same partitioning strategy) do records with the same key end up in the same partition number in each topic. This allows A-0/B-0, A-1/B-1, etc. to be processed together.
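To make the argument concrete, here is a minimal sketch of the "hash the key, then take it modulo the partition count" idea. It is an illustration only: `CoPartitionDemo`, `partitionFor`, and `coPartitioned` are hypothetical names, and `String.hashCode()` is used as a stand-in hash, so the partition numbers it prints will not match what a real Kafka producer computes. The point is only that the same key can map to different partition numbers when the counts differ.

```java
import java.util.List;

public class CoPartitionDemo {

    // Assumption: a simplified stand-in for a partitioner.
    // Non-negative hash of the key, modulo the number of partitions.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // True if every key lands on the same partition number in both topics.
    static boolean coPartitioned(List<String> keys, int partitionsA, int partitionsB) {
        for (String key : keys) {
            if (partitionFor(key, partitionsA) != partitionFor(key, partitionsB)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("X", "Y", "order-17", "user-42");

        // Topic A with 2 partitions, topic B with 3 partitions.
        for (String key : keys) {
            System.out.printf("key=%s -> A-%d, B-%d%n",
                    key, partitionFor(key, 2), partitionFor(key, 3));
        }

        System.out.println("2 vs 3 partitions co-partitioned: " + coPartitioned(keys, 2, 3));
        System.out.println("3 vs 3 partitions co-partitioned: " + coPartitioned(keys, keys.size() > 0 ? 3 : 3, 3));
    }
}
```

With equal partition counts and the same hash function, the check holds for any set of keys, because both topics compute the identical value for each key.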