
Multiple Kafka Streams vs one stream consuming multiple topics

Which of the following is the best practice for a production environment:

1: One stream consuming from multiple topics and writing to multiple topics (sketched below).

2: Creating multiple streams (each with a different app.id), each consuming from a different topic and writing to a different topic.
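
For concreteness, here is a minimal sketch of option 1, assuming String keys/values; the topic names and broker address are placeholders, not part of the original question. A single application.id covers two independent sub-topologies:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    import java.util.Properties;

    public class OneAppAllTopics {
        public static void main(String[] args) {
            Properties props = new Properties();
            // One app.id for the whole topology (option 1).
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "one-app-all-topics");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();

            // Two independent sub-topologies, each reading one topic and writing another.
            KStream<String, String> a = builder.stream("input-a");
            a.mapValues(v -> v.toUpperCase()).to("output-a");

            KStream<String, String> b = builder.stream("input-b");
            b.filter((k, v) -> v != null).to("output-b");

            new KafkaStreams(builder.build(), props).start();
        }
    }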

I am not sure about the 1st approach: when the amount of data across all topics increases, won't the consumer lag?

On what factors should I decide which of the above approaches is best suited to my scenario?

Update 1: I have 2 topics. The 1st topic has 1 partition (because I need to maintain ordering). The 2nd topic has 6 partitions.

It depends very much on your use case (e.g., what sort of business logic the consumers perform, and how they are deployed: standalone apps, clusters, etc.). Your question is more on the architecture side. Both solutions are viable; the particularities lie in your specific use case.

If you semantically split your business logic into different streams, I would suggest going with the second option.
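
A minimal sketch of option 2, assuming hypothetical topic and application names: each business concern gets its own topology and its own application.id, and therefore its own consumer group, offsets, and local state:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    import java.util.Properties;

    public class SplitApps {
        static KafkaStreams buildApp(String appId, String inputTopic, String outputTopic) {
            Properties props = new Properties();
            // A distinct app.id per stream => an independent consumer group
            // and state store set for each piece of business logic.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, appId);
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> in = builder.stream(inputTopic);
            in.to(outputTopic); // real business logic would go here
            return new KafkaStreams(builder.build(), props);
        }

        public static void main(String[] args) {
            // Typically these would run as separate deployments; shown together for brevity.
            buildApp("orders-app", "orders", "orders-processed").start();
            buildApp("events-app", "events", "events-processed").start();
        }
    }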

Regarding the amount of data, keep in mind that Kafka consumers are pull-based, which gives them a built-in back-pressure mechanism: they only fetch as much as they can process.

I always suggest going with option 2, because with option 2 you can also achieve fault tolerance: if one of your application instances goes down, the stream partitions handled by that instance will be redistributed to the other running instances. If you want parallelism, you should use the same app.id for all the stream processing instances.
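
As a sketch of that scaling model: deploying the same application (identical application.id) on several machines makes Kafka's group protocol spread the stream tasks (one per input partition) across instances and reassign them when an instance dies. The app name, broker address, and thread count below are assumptions for illustration:

    import org.apache.kafka.streams.StreamsConfig;

    import java.util.Properties;

    public class SharedAppIdConfig {
        // Run the same configuration on every instance: Kafka assigns each
        // instance a subset of the stream tasks and rebalances the tasks of a
        // failed instance onto the surviving ones.
        static Properties config() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // identical on all instances
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
            props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 3);               // threads per instance (assumed)
            return props;
        }
    }

Note that parallelism is capped by the partition count: the 1-partition topic from Update 1 will always be handled by a single task, no matter how many instances you run.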
