
Partition processing stuck until state store is rebuilt during rebalancing in Kafka Streams

Let's assume I have a stateful Kafka Streams application that consumes data from a topic with 3 partitions. At the moment I have 2 instances of this application running. Let's put it like this: instance1 has partitions part1 and part2 assigned, instance2 has part3 .

So now I want to add a new instance to fully utilize the parallelization.

In my understanding, as soon as I start a new instance, a rebalance occurs: one of the partitions part1 or part2 and the corresponding local state store will be migrated from an existing instance to the newly added instance. In this example, let's imagine that part1 migrates to instance3 .

At the same time, I realize that the new instance instance3 will not start processing new data until it has restored the local state store from the changelog topic, which may take a long time.

During the period from starting the new instance until it has restored the state store:

  • does it mean that the data from part1 is not processed and remains stuck until instance3 finishes starting up?
  • if yes, what are the approaches to estimate how much time it will take for instance3 to rebuild the local state store?
  • during this time, are the other instances unaffected by the rebalancing, and do they keep processing data with no downtime ( instance1 - part2 , instance2 - part3 )?

The rebalance upon adding a new instance happens at the consumer group level. This means that all the partitions assigned to all the consumers of the consumer group are revoked and then re-distributed. So all the partitions - part1, part2 and part3 - would be stuck until the rebalancing is complete.

Estimating the downtime is a bit tricky. You could emit events on the rebalance trigger and on consumption start, then compute the time difference between the two events to get an estimate of the downtime. If you have plain Java consumer logs, you can also get a rough estimate from them, since all the relevant log lines (partitions revoked as well as partitions assigned) are already there.
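For Kafka Streams specifically, the restore phase itself can be measured by attaching a StateRestoreListener to the KafkaStreams instance. A minimal sketch follows; the class name and the log format are just illustrative:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.streams.processor.StateRestoreListener;

    // Measures how long the restore of each state store partition takes.
    public class RestoreTimingListener implements StateRestoreListener {

        private final Map<TopicPartition, Long> startTimes = new ConcurrentHashMap<>();

        @Override
        public void onRestoreStart(TopicPartition partition, String storeName,
                                   long startingOffset, long endingOffset) {
            startTimes.put(partition, System.currentTimeMillis());
            System.out.printf("Restore of %s (%s) started, %d records to replay%n",
                    storeName, partition, endingOffset - startingOffset);
        }

        @Override
        public void onBatchRestored(TopicPartition partition, String storeName,
                                    long batchEndOffset, long numRestored) {
            // progress could be logged here as well
        }

        @Override
        public void onRestoreEnd(TopicPartition partition, String storeName,
                                 long totalRestored) {
            long start = startTimes.getOrDefault(partition, System.currentTimeMillis());
            System.out.printf("Restore of %s (%s) finished, %d records in %d ms%n",
                    storeName, partition, totalRestored, System.currentTimeMillis() - start);
        }
    }

    // Usage (before starting the instance):
    //   KafkaStreams streams = new KafkaStreams(topology, props);
    //   streams.setGlobalStateRestoreListener(new RestoreTimingListener());
    //   streams.start();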

Rebalancing has evolved with the recent releases:

from version 2.4.0 with KIP-429

  • incremental cooperative rebalancing was added, replacing the stop-the-world rebalancing protocol
  • optimized for the cloud in the sense of better rebalance behavior for members that drop out (e.g. when a Pod dies and restarts)
  • a consumer does not need to revoke a partition if the group coordinator reassigns the same partition to that consumer again

=> part2 and part3 are not stuck and continue to be processed
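Kafka Streams handles the cooperative protocol through its own assignor once all instances run 2.4+. For a plain Java consumer you would opt in yourself by configuring the CooperativeStickyAssignor; a minimal sketch, where the bootstrap server and group id are placeholders:

    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class CooperativeConsumerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            // Opt in to incremental cooperative rebalancing (KIP-429); only partitions
            // that actually change owner are revoked, the rest keep being consumed.
            props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                    CooperativeStickyAssignor.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // subscribe and poll as usual
            }
        }
    }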

from version 2.6.0 with KIP-441

  • improves Kafka Streams scale-out behavior, especially for stateful tasks
  • previously some tasks were blocked from processing until the state store was rebuilt, which could take hours
  • now the new instance first catches the state store up from the changelog topic and only then takes over the task as active
  • no downtime during the scale-out

=> part1 continues to be processed on instance1 until instance3 has rebuilt the state store for part1 and is ready to take over its processing
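The hand-over behavior is driven by a few Streams configs related to KIP-441 (warm-up replicas, acceptable recovery lag, probing interval). The values in the sketch below are illustrative defaults, not tuning advice, and the application id and bootstrap server are placeholders:

    import java.util.Properties;

    import org.apache.kafka.streams.StreamsConfig;

    public class ScaleOutConfig {
        static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-stateful-app");   // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

            // Max changelog lag at which a restoring task counts as "caught up"
            // and may take over as the active task.
            props.put(StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG, 10_000L);

            // How many extra warm-up replicas may restore state in parallel
            // during a scale-out.
            props.put(StreamsConfig.MAX_WARMUP_REPLICAS_CONFIG, 2);

            // How often the assignor checks whether warm-up replicas have caught
            // up so the tasks can be handed over.
            props.put(StreamsConfig.PROBING_REBALANCE_INTERVAL_MS_CONFIG, 10 * 60 * 1000L);

            // Optional: permanent standby replicas avoid a full restore on
            // fail-over in the first place.
            props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);

            return props;
        }
    }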
