
Kafka Consumer to read from multiple topics

I am very new to Kafka. I am creating two topics and publishing to them from two producers. A single consumer reads the messages from both topics, because I want to process them according to priority.

I get a stream for each topic, but as soon as I start iterating over the ConsumerIterator of either stream, it blocks there. As the documentation says, it blocks until a new message arrives.

Does anyone know how to read from two topics and two streams with a single Kafka consumer?

    // Request one stream per topic from the high-level consumer.
    Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
    topicCountMap.put(KafkaConstants.HIGH_TEST_TOPIC, 1);
    topicCountMap.put(KafkaConstants.LOW_TEST_TOPIC, 1);
    Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
            consumerConnector.createMessageStreams(topicCountMap);
    KafkaStream<byte[], byte[]> highPriorityStream =
            consumerMap.get(KafkaConstants.HIGH_TEST_TOPIC).get(0);
    ConsumerIterator<byte[], byte[]> highPriorityIterator = highPriorityStream.iterator();

    // hasNext() blocks here until a new message arrives on the high-priority topic.
    while (highPriorityStream.nonEmpty() && highPriorityIterator.hasNext())
    {
        byte[] bytes = highPriorityIterator.next().message();
        try
        {
            Object obj = SerializationUtils.deserialize(bytes);
            if (obj instanceof CLoudDataObject)
            {
                CLoudDataObject thunderDataObject = (CLoudDataObject) obj;
                System.out.println(thunderDataObject);
                // TODO Got the Thunder object here, now write code to send it to Thunder service.
            }
        }
        catch (Exception e)
        {
            // Report the failure instead of silently swallowing it.
            e.printStackTrace();
        }
    }

If you don't want to process lower-priority messages before high-priority ones, how about setting the consumer.timeout.ms property and catching ConsumerTimeoutException to detect that the high-priority stream has reached the last available message? By default it is set to -1, which blocks until a new message arrives. (http://kafka.apache.org/07/configuration.html)
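The timeout idea can be sketched without a broker. In this minimal sketch, each BlockingQueue is a hypothetical stand-in for one topic's stream, and a `poll` that returns null after the timeout plays the role of a ConsumerTimeoutException raised once consumer.timeout.ms elapses. The class name, method name, and parameters are assumptions made for illustration, not part of the Kafka API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in: each queue plays the role of one topic's stream.
class PriorityDrainSketch {

    // Processes up to maxMessages, always checking the high-priority source first.
    static List<String> drain(BlockingQueue<String> high, BlockingQueue<String> low,
                              long timeoutMs, int maxMessages) {
        List<String> processed = new ArrayList<>();
        try {
            while (processed.size() < maxMessages) {
                // Bounded wait, analogous to iterating with consumer.timeout.ms set.
                String msg = high.poll(timeoutMs, TimeUnit.MILLISECONDS);
                if (msg == null) {
                    // Timed out: no high-priority message pending, so take
                    // at most one low-priority message before re-checking.
                    msg = low.poll(timeoutMs, TimeUnit.MILLISECONDS);
                }
                if (msg == null) {
                    break; // both sources idle: end the demo (a real consumer would keep looping)
                }
                processed.add(msg);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and stop
        }
        return processed;
    }

    public static void main(String[] args) {
        BlockingQueue<String> high = new LinkedBlockingQueue<>();
        BlockingQueue<String> low = new LinkedBlockingQueue<>();
        low.add("L1");
        low.add("L2");
        high.add("H1");
        high.add("H2");
        // High-priority messages come out first even though they were queued later.
        System.out.println(drain(high, low, 50, 10));
    }
}
```

Because the loop returns to the high-priority source after every low-priority message, a newly arrived high-priority message waits behind at most one low-priority message.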

The following explains a way to process multiple flows concurrently with different priorities.

Consuming several Kafka streams at once calls for multi-threaded programming. In your case, the stream of each topic should be processed by its own thread. Because each thread runs independently to process messages, one blocked flow (thread) won't affect the other flows.

Java's thread pool implementation (ExecutorService) can help with building a multi-threaded application. You can find an example implementation here:

https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
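As a rough illustration of the one-thread-per-stream idea, here is a minimal sketch using only java.util.concurrent. Each BlockingQueue is a hypothetical stand-in for one topic's stream, and the blocking `take()` call plays the role of the blocking iterator; the class and method names are assumptions for illustration and do not come from the linked example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in: each queue plays the role of one topic's stream.
class StreamWorkersSketch {

    // Starts one worker per stream and waits until `expected` messages were processed
    // or waitMs elapsed, then stops the workers and returns what was processed.
    static List<String> consumeAll(List<BlockingQueue<String>> streams, int expected, long waitMs) {
        List<String> processed = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(expected);
        ExecutorService pool = Executors.newFixedThreadPool(streams.size());
        for (BlockingQueue<String> stream : streams) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String msg = stream.take(); // blocks only this worker
                        processed.add(msg);
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow() ends the worker
                }
            });
        }
        try {
            done.await(waitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return new ArrayList<>(processed);
    }

    public static void main(String[] args) {
        BlockingQueue<String> high = new LinkedBlockingQueue<>(); // stays empty: its worker blocks
        BlockingQueue<String> low = new LinkedBlockingQueue<>(Arrays.asList("L1", "L2"));
        // The low-priority messages are still processed while the other worker is blocked.
        System.out.println(consumeAll(Arrays.asList(high, low), 2, 1000));
    }
}
```

The pool is sized to the number of streams so that every stream has a dedicated thread; with fewer threads than streams, a blocked worker would starve the streams queued behind it.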

Regarding execution priority, you can call Thread.currentThread().setPriority(...) in each thread to give it a priority based on the Kafka topic it serves. Note that thread priorities are only hints to the scheduler, so they influence but do not guarantee the processing order.
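A minimal sketch of that call, assuming each worker handles exactly one topic: the hypothetical helper `withPriority` wraps a task so that the thread running it sets its own priority first. The helper names and the mapping of topics to Thread.MAX_PRIORITY and Thread.MIN_PRIORITY are illustrative assumptions.

```java
// Hypothetical helper: wraps a per-topic task so its thread adjusts its own priority.
class TopicPrioritySketch {

    static Runnable withPriority(int priority, Runnable task) {
        return () -> {
            // Runs on the worker thread itself, as suggested in the answer.
            Thread.currentThread().setPriority(priority);
            task.run();
        };
    }

    // Runs a wrapped task on a fresh thread and reports the priority it ran with.
    static int observedPriority(int requested) {
        int[] seen = new int[1];
        Thread worker = new Thread(withPriority(requested,
                () -> seen[0] = Thread.currentThread().getPriority()));
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // Assumed mapping: high-priority topic -> MAX_PRIORITY, low-priority topic -> MIN_PRIORITY.
        System.out.println("high topic worker priority: " + observedPriority(Thread.MAX_PRIORITY));
        System.out.println("low topic worker priority: " + observedPriority(Thread.MIN_PRIORITY));
    }
}
```

With a thread pool, the same wrapper can be passed to the executor in place of the raw task, since the priority is set from inside the running thread.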
