Why does Kafka 0.8.2 say that each partition is consumed by exactly one consumer in a consumer group?

In the official Apache Kafka 0.8.2 documentation, section 5.6 Distribution, under the Consumers and Consumer Groups subsection, it says:

The consumers in a group divide up the partitions as fairly as possible, each partition is consumed by exactly one consumer in a consumer group.

But I have found that, in practice, multiple consumers in a consumer group can consume data from a single partition by each sending a FetchRequest for the same topic-partition.

The Consumer Id Registry subsection that follows says:

In addition to the group_id which is shared by all consumers in a group, each consumer is given a transient, unique consumer_id (of the form hostname:uuid) for identification purposes. Consumer ids are registered in the following directory.

/consumers/[group_id]/ids/[consumer_id] --> {"topic1": #streams, ..., "topicN": #streams} (ephemeral node)

It says there is a unique id for each consumer. However, I could not find such a structure in ZooKeeper.

When does a consumer register itself? The client library I use is kafka-python 0.9.4.

Maybe this helps:

(1) For your second question, see https://github.com/dpkp/kafka-python/issues/472 and issue 38.

It says "Coordinated Consumer Group support is under development." That would explain why no consumer id shows up in ZooKeeper with kafka-python 0.9.4.

(2) For your first question.

The documentation says "This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group." (statement A). Whether this holds depends on the client implementation, and it may not be true for some Kafka clients. I only have experience with the Python and C++ clients.

If consumer groups are implemented, each message is consumed by exactly one consumer in the group, but how partitions are assigned to the consumers in a group differs between clients. When there are more partitions than consumers, statement A may hold. However, partitions may be reassigned when consumers join or leave the existing group. In that case a partition may be consumed first by consumer A and later by consumer B. In some clients you can also choose the assignment algorithm, such as round-robin.
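To make the reassignment point concrete, here is a minimal sketch in plain Python. It is not the code of Kafka or of any client library: the function names `assign_range` and `assign_round_robin`, the consumer names, and the partition numbers are all made up for illustration. It only shows how a group could split partitions so that each one has exactly one owner at a time, and how that owner can change when a consumer joins.

```python
def assign_range(partitions, consumers):
    """Hypothetical range-style assignment: each consumer gets a
    contiguous block of partitions; earlier consumers get one extra
    partition when the split is uneven."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    consumers = sorted(consumers)
    partitions = sorted(partitions)
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for i, consumer in enumerate(consumers):
        count = per_consumer + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


def assign_round_robin(partitions, consumers):
    """Hypothetical round-robin assignment: partitions are dealt out
    to the sorted consumers one at a time."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    consumers = sorted(consumers)
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


if __name__ == "__main__":
    partitions = [0, 1, 2, 3, 4]

    # Two consumers in the group.
    print(assign_range(partitions, ["consumer-A", "consumer-B"]))

    # A third consumer joins and the partitions are divided again:
    # partition 2 moves from consumer-A to consumer-B.
    print(assign_range(partitions, ["consumer-A", "consumer-B", "consumer-C"]))

    # Same group, different algorithm.
    print(assign_round_robin(partitions, ["consumer-A", "consumer-B", "consumer-C"]))
```

In every result each partition appears in exactly one consumer's list, which is what statement A describes. Nothing in this sketch stops a client that ignores the assignment from fetching any partition it likes, which matches what the question observed.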
