
Kafka topic with Partitions

Simple question:

Let's assume I have a topic named StateEvents with 3 partitions: P1, P2 and P3.

Let's also assume that the producer generates 20 messages:

1, 2, 3, ..., 20

My question is:

When the producer produces these messages:

1) Will each message be in one and only one partition? That is, 1 in P1, 2 in P2, 3 in P3, then 4 in P1, 5 in P2, 6 in P3, and so on?

2) If #1 is true, when a consumer subscribes, would it be subscribing to ALL partitions so that it gets all messages?

Thanks

  1. Yes, each message is written to only one partition.

  2. When a single consumer subscribes to a Kafka topic that has multiple partitions, it reads messages from all partitions. But if you run multiple consumers with the same consumer group.id, each consumer reads from a different subset of the partitions.

Say a Kafka topic has 3 partitions and you have 3 consumers with the same group.id. Each consumer will then read from exactly one partition. But if there is only one consumer, it will read from all 3 partitions.
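The 3-partition example above can be played through with a small standalone sketch. It does not use the Kafka client at all: `GroupAssignmentSketch` and `assign` are hypothetical names, and the round-robin spreading is only an assumption made for illustration, since a real consumer group leaves the assignment to a configurable assignor.

```scala
object GroupAssignmentSketch {
  // Hypothetical helper: spreads the partitions of a topic over the consumers
  // of ONE group so that every partition has exactly one owner in that group.
  // This is a plain round-robin illustration, not Kafka's actual assignor.
  def assign(partitions: Seq[Int], consumers: Seq[String]): Map[String, Seq[Int]] = {
    require(consumers.nonEmpty, "a group needs at least one consumer")
    // Start with every consumer owning nothing.
    val nothingOwned = consumers.map(c => c -> Seq.empty[Int]).toMap
    partitions.zipWithIndex.foldLeft(nothingOwned) { case (owned, (partition, i)) =>
      val owner = consumers(i % consumers.size)
      owned.updated(owner, owned(owner) :+ partition)
    }
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(0, 1, 2) // zero-based stand-ins for P1, P2, P3
    println(assign(partitions, Seq("c1")))                   // one consumer owns everything
    println(assign(partitions, Seq("c1", "c2", "c3")))       // one partition each
    println(assign(partitions, Seq("c1", "c2", "c3", "c4"))) // c4 is left without a partition
  }
}
```

The last call shows why adding more consumers than partitions does not help: the extra consumer has nothing to read.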

1) The destination partition is determined by the producer. With the default partitioner (which can be replaced by a custom one), the destination partition is hash(message-key) % num_partitions. This means that all messages with the same key go to the same partition. So if you use a key and all messages have the same key, they all end up in the same partition. If no key is specified, round robin is used. In any case, a message always goes to exactly ONE partition.

2) If the consumer is the only one in its consumer group, it gets all the partitions. You can add more consumers to the same consumer group to share the load (in your case up to 3 consumers, which is the number of partitions in the topic, so that each consumer gets one partition).

Having different consumers read from different partitions is how Kafka scales so well. It's not a drawback, because you have to think in terms of the consuming application (made up of multiple consumers). The application identifier can be the group id used by all its consumers: the application gets ALL the messages from the topic, but the load is spread across its consumers.
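To make point 1) concrete, here is a minimal sketch of the two rules it describes, written without any Kafka dependency. `PartitionSketch`, `partitionFor` and `roundRobin` are hypothetical names, and `String.hashCode` is only a stand-in for the hash: as far as I know, the Java client hashes the serialized key bytes instead, and newer clients may not rotate strictly per message when there is no key.

```scala
object PartitionSketch {
  // Keyed messages: hash(key) % numPartitions, kept non-negative with floorMod.
  // String.hashCode is an assumption for illustration, not Kafka's real hash.
  def partitionFor(key: String, numPartitions: Int): Int =
    java.lang.Math.floorMod(key.hashCode, numPartitions)

  // Keyless messages: plain round robin over the partitions, starting at 0.
  def roundRobin(messageCount: Int, numPartitions: Int): Seq[Int] =
    (0 until messageCount).map(_ % numPartitions)

  def main(args: Array[String]): Unit = {
    // The same key always lands on the same partition.
    val keyed = Seq("sensor-a", "sensor-b", "sensor-a").map(k => k -> partitionFor(k, 3))
    println(keyed)

    // 20 keyless messages over 3 partitions (0, 1, 2 standing in for P1, P2, P3).
    val perPartition = roundRobin(20, 3).groupBy(identity).map { case (p, xs) => p -> xs.size }
    println(perPartition)
  }
}
```

Either way, each message index maps to exactly one partition number, which is the point of the answer.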

Each message will be sent to only one partition.

If the key is not null, the partition ID is calculated by the partitioner class set in the configuration. Here is the source code of the default one:

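// Default partitioner as quoted from Kafka's Scala producer code:
// hash the key, then take the result modulo the number of partitions.
class DefaultPartitioner(props: VerifiableProperties = null) extends Partitioner {
  private val random = new java.util.Random

  // key must be non-null here; equal keys always map to the same partition.
  def partition(key: Any, numPartitions: Int): Int = {
    Utils.abs(key.hashCode) % numPartitions
  }
}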

To preserve message ordering (FIFO) within a partition, a partition can be consumed by only one consumer within a given group, while consumers in different groups can be bound to the same partition and each consume its messages again. But a single consumer can consume more than one partition.
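A rough way to picture this is a partition as an ordered log with one read position per group. The sketch below is an in-memory toy, not Kafka's API: `GroupOffsetsSketch`, `PartitionLog` and `poll` are hypothetical names chosen for illustration, and the group ids are made up.

```scala
import scala.collection.mutable

object GroupOffsetsSketch {
  // Toy model of ONE partition: an ordered list of records plus, for each
  // consumer group, the position of the next record that group will read.
  final class PartitionLog(records: Vector[String]) {
    private val nextOffset = mutable.Map.empty[String, Int]

    // Returns up to maxRecords records in log order for the given group and
    // advances only that group's position.
    def poll(groupId: String, maxRecords: Int): Vector[String] = {
      val from = nextOffset.getOrElse(groupId, 0)
      val batch = records.slice(from, from + maxRecords)
      nextOffset(groupId) = from + batch.size
      batch
    }
  }

  def main(args: Array[String]): Unit = {
    val log = new PartitionLog(Vector("m1", "m2", "m3"))
    println(log.poll("billing", 2)) // billing reads from the start of the log
    println(log.poll("audit", 3))   // audit has its own position and re-reads the same records
    println(log.poll("billing", 2)) // billing continues where it stopped
  }
}
```

Within a group the records come out in log order, and a second group reading the same partition does not disturb the first group's position.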
