
How does Kafka schedule a consumer that consumes from multiple topics?

I have a Kafka consumer that consumes from multiple topics (30+), each with 6 partitions. I would like to learn how a single consumer consumes from multiple topics and partitions, and how it decides which topic, partition, and offset to consume next.

I am facing consumer lag issues and want to learn more about how a consumer consumes from multiple topics.

Will it start multiple threads?
Will it schedule itself between partitions?
What kind of scheduling will it use?

My question is about a single consumer consuming from multiple topics. Say every topic is loaded with 1M records and a single consumer has to process all of them. In what order will it read from the topics (that is, which topic/partition comes first, and so on)?

Any links to Kafka internals would help.

Will it start multiple threads?

For the Java consumer API, no. Apart from the heartbeat thread, only one thread is used to fetch records.

Will it schedule itself between partitions?

The fetcher batches records by topic partition. Say you have three topics, t1, t2 and t3, each with two partitions. The consumer may end up reading them in an order such as t3-1, t3-0, t2-0, t2-1, t1-0, t1-1.

What kind of scheduling will it use?

Basically, it uses a round-robin policy across the assigned partitions to ensure fairness.
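To make the round-robin idea concrete, here is a minimal toy model in Java. It is only a sketch under stated assumptions, not the client's actual fetch code: the class name RoundRobinFetchSketch, the method serveOrder, the per-turn cap batchSize, and the backlog numbers are all hypothetical. Each turn serves the partition at the head of a queue, takes at most batchSize records from it, and moves it to the back of the queue if it still has records left.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of round-robin draining across partitions; NOT Kafka's real fetcher. */
final class RoundRobinFetchSketch {

    /**
     * Returns the order in which partitions are served when every turn takes
     * at most batchSize records from the partition at the head of the queue.
     *
     * @param backlog   hypothetical pending-record count per partition, in assignment order
     * @param batchSize hypothetical per-turn cap (an assumption of this sketch)
     */
    static List<String> serveOrder(Map<String, Integer> backlog, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // Copy the input so the caller's map is left untouched.
        Map<String, Integer> remaining = new LinkedHashMap<>(backlog);
        Deque<String> queue = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() > 0) {
                queue.addLast(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            String partition = queue.pollFirst();
            int pending = remaining.get(partition);
            int left = pending - Math.min(batchSize, pending);
            remaining.put(partition, left);
            order.add(partition);
            // A partition with records left goes to the back, so the others get a turn first.
            if (left > 0) {
                queue.addLast(partition);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Integer> backlog = new LinkedHashMap<>();
        backlog.put("t1-0", 250);
        backlog.put("t1-1", 100);
        backlog.put("t2-0", 300);
        System.out.println(serveOrder(backlog, 100));
    }
}
```

In this model a partition with a large backlog cannot monopolize the consumer: it gets one capped turn and then waits behind every other partition that still has data. With 30+ topics of 6 partitions each, that means a long gap between two turns for the same partition, which is worth keeping in mind when looking at per-partition lag.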

There do not seem to be any links that document these internals. See the SubscriptionState and PartitionStates classes in the client source for details.
