
Concurrently consume multiple topics as a Kafka consumer

I am following the example at https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example on consuming Kafka topics concurrently.

In the section on creating the thread pool, the example has the following code:

public void run(int a_numThreads) {
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
    topicCountMap.put(topic, new Integer(a_numThreads));
    Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
    List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);


    // now launch all the threads
    //
    executor = Executors.newFixedThreadPool(a_numThreads);

    // now create an object to consume the messages
    //
    int threadNumber = 0;
    for (final KafkaStream stream : streams) {
        executor.submit(new ConsumerTest(stream, threadNumber));
        threadNumber++;
    }
}

I can add more topics to topicCountMap. For example:

topicCountMap.put("channel1", new Integer(a_numThreads));
topicCountMap.put("channel2", new Integer(a_numThreads));
topicCountMap.put("channel3", new Integer(a_numThreads));

In the code above, however, it seems to me that the streams list is retrieved for only one of the topics:

List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);

I am not sure how to create multiple stream lists, one per topic, and then iterate through them so that every stream is submitted to the executor and data is fetched from each of the channels.

Suppose you have:

String topic1 = "channel1";
String topic2 = "channel2";
String topic3 = "channel3";

Then you can indeed do:

topicCountMap.put(topic1, new Integer(a_numThreads_topic1));
topicCountMap.put(topic2, new Integer(a_numThreads_topic2));
topicCountMap.put(topic3, new Integer(a_numThreads_topic3));

Once you have your consumerMap (the code that creates it doesn't change), you can retrieve the streams for each topic:

List<KafkaStream<byte[], byte[]>> topic1_streams = consumerMap.get(topic1);
List<KafkaStream<byte[], byte[]>> topic2_streams = consumerMap.get(topic2);
List<KafkaStream<byte[], byte[]>> topic3_streams = consumerMap.get(topic3);

To consume from the streams, create one executor per topic, each with the right number of threads:

executors_topic1 = Executors.newFixedThreadPool(a_numThreads_topic1);
executors_topic2 = Executors.newFixedThreadPool(a_numThreads_topic2);
executors_topic3 = Executors.newFixedThreadPool(a_numThreads_topic3);

Finally, submit one consumer per stream:

int threadNumber = 0;
for (final KafkaStream stream : topic1_streams) {
    executors_topic1.submit(new ConsumerTest(stream, threadNumber));
    threadNumber++;
}
for (final KafkaStream stream : topic2_streams) {
    executors_topic2.submit(new ConsumerTest(stream, threadNumber));
    threadNumber++;
}
for (final KafkaStream stream : topic3_streams) {
    executors_topic3.submit(new ConsumerTest(stream, threadNumber));
    threadNumber++;
}

Of course, this is only meant to give you the idea; the code can obviously be improved.
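
As one possible improvement, the per-topic variables and loops can be replaced by a single loop over the map, with one pool sized to the sum of the per-topic thread counts. The sketch below is only an illustration: MultiTopicLauncher, totalThreads, submitAll and workerFactory are hypothetical names that are not part of Kafka, and the type parameter S stands in for KafkaStream<byte[], byte[]> so that the snippet compiles without the Kafka jars.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.BiFunction;

// Hypothetical helper, not part of Kafka.
// S is a placeholder for KafkaStream<byte[], byte[]>.
class MultiTopicLauncher {

    // Pool size needed for one shared executor:
    // the sum of the per-topic thread counts in topicCountMap.
    static int totalThreads(Map<String, Integer> topicCountMap) {
        int total = 0;
        for (int n : topicCountMap.values()) {
            total += n;
        }
        return total;
    }

    // Submits one worker per stream, across all topics, to a single executor.
    // threadNumber keeps increasing across topics, as in the loops above.
    static <S> List<Future<?>> submitAll(Map<String, List<S>> streamsByTopic,
                                         ExecutorService executor,
                                         BiFunction<S, Integer, Runnable> workerFactory) {
        List<Future<?>> futures = new ArrayList<>();
        int threadNumber = 0;
        for (Map.Entry<String, List<S>> entry : streamsByTopic.entrySet()) {
            for (S stream : entry.getValue()) {
                futures.add(executor.submit(workerFactory.apply(stream, threadNumber)));
                threadNumber++;
            }
        }
        return futures;
    }
}
```

Assuming ConsumerTest implements Runnable as in the linked example, the call would look roughly like MultiTopicLauncher.submitAll(consumerMap, executor, (stream, n) -> new ConsumerTest(stream, n)), with executor created by Executors.newFixedThreadPool(MultiTopicLauncher.totalThreads(topicCountMap)). Separate pools per topic, as shown above, remain a reasonable choice if you want to shut topics down independently.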


 