I have a Kafka consumer that subscribes to multiple topics (30+), each with 6 partitions. I would like to learn how a single consumer consumes from multiple topics and partitions, and how it decides which topic, partition, and offset to read next.
I am facing consumer lag issues and want to understand how a consumer reads from multiple topics.
Will it start multiple threads?
Will it schedule itself between partitions?
What kind of scheduling will it use?
My question is about a single consumer consuming from multiple topics. Let's say every topic is loaded with 1M records and a single consumer has to process all of them. In what order will it read from the topics (that is, which topic/partition comes first, and so on)?
Any links to material on Kafka internals would help.
Will it start multiple threads?
With the Java consumer API, no. Only one thread (not counting the background heartbeat thread) is used to fetch the records.
Will it schedule itself between partitions?
The fetcher batches records by topic partition. Say you have three topics, t1, t2 and t3, each with two partitions. The order in which partitions are returned may end up something like t3-1, t3-0, t2-0, t2-1, t1-0, t1-1.
What kind of scheduling will it use?
Basically, it uses a round-robin policy across the assigned partitions to ensure fairness, so no single partition is served repeatedly while the others wait.
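To make the round-robin idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Kafka's actual code: the class name RoundRobinFetchSketch, the simulate method, and the perFetch cap are all hypothetical, and the cap is only a stand-in for whatever real limits bound how much one fetch returns. The one idea it illustrates is that partitions which were just served move to the back of the line, so the remaining partitions go first on the next fetch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class RoundRobinFetchSketch {

    /**
     * Simulates `rounds` fetches over the assigned partitions.
     * Each fetch serves at most `perFetch` partitions (a hypothetical cap),
     * then moves the served partitions to the end of the queue.
     */
    public static List<List<String>> simulate(List<String> assigned, int perFetch, int rounds) {
        // Insertion-ordered set: the head is the next partition to be served.
        LinkedHashSet<String> order = new LinkedHashSet<>(assigned);
        List<List<String>> result = new ArrayList<>();

        for (int r = 0; r < rounds; r++) {
            // Take partitions from the head of the queue until the cap is hit.
            List<String> served = new ArrayList<>();
            for (String tp : order) {
                if (served.size() == perFetch) {
                    break;
                }
                served.add(tp);
            }
            // Re-inserting a served partition puts it at the tail of the queue.
            for (String tp : served) {
                order.remove(tp);
                order.add(tp);
            }
            result.add(served);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> assigned = List.of("t1-0", "t1-1", "t2-0", "t2-1", "t3-0", "t3-1");
        List<List<String>> fetches = simulate(assigned, 4, 3);
        for (int i = 0; i < fetches.size(); i++) {
            System.out.println("fetch " + (i + 1) + ": " + fetches.get(i));
        }
    }
}
```

With six partitions and a cap of four per fetch, the first fetch serves t1-0, t1-1, t2-0, t2-1, the second starts with t3-0 and t3-1 before wrapping around to t1-0 and t1-1, and the third starts with t2-0. No partition is starved, but a lagging partition also gets no priority over the others.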
As for links, I am not aware of any documentation that covers these internals. See the SubscriptionState and PartitionStates classes in the Kafka client source for details.