How to get all the messages from a Kafka topic and count them using Java?

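This code sometimes gives me all the messages from the beginning and then waits for another message; at other times it just waits for another message.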

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class TestConsumer{

public static void main(String[] args) {
    ConsumerConfig config;
    Properties props = new Properties(); 
    props.put("zookeeper.connect","sandbox.hortonworks.com:2181");
    props.put("group.id", "group-4");
    props.put("zookeeper.session.timeout.ms", "400");
    props.put("zookeeper.sync.time.ms", "200");
    props.put("auto.commit.interval.ms", "200");
    config = new ConsumerConfig(props);
    ConsumerConnector consumer = kafka.consumer.Consumer.createJavaConsumerConnector
            (config);
    String topic = "News"; 
    System.out.println("Running");
    Run(consumer,topic); 
}

public static void Run(ConsumerConnector consumer,String topic){
    HashMap<String,Integer> topicCountMap = 
            new HashMap<String,Integer>();
    topicCountMap.put(topic, 1);
    Map<String,List<KafkaStream<byte[],byte[]>>> 
    consumerMap = consumer.createMessageStreams(topicCountMap);
    KafkaStream<byte[],byte[]> stream = consumerMap.get(topic).get(0);
    ConsumerIterator<byte[],byte[]> it =  stream.iterator();
    List<String> msgTopicList = new ArrayList<String>();
    int count = 0;
    System.out.println("Waiting");
    while(it.hasNext()){
        MessageAndMetadata<byte[],byte[]> msgAndData = it.next(); 
        String msg = new String(msgAndData.message());
        msgTopicList.add(msg);
        String key = "NoKey";
        System.out.println(msg);
        count++;
    }
}
}

What I want to do is get all the messages in the topic delivered to the consumer and count them.

What is the best way to do this?

Version: kafka_2.10-0.8.1.2.2.4.2-2

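Here is your example.

The most important part here is the Kafka consumer configuration properties:

Start from the beginning of the topic when the consumer group has no committed offset.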

props.put("auto.offset.reset", "smallest");  

Do not store offsets for this consumer.

props.put("auto.commit.enable", "false");

If no more messages are available, the consumer waits 5 seconds for a new one and then throws a ConsumerTimeoutException.

props.put("consumer.timeout.ms", "5000");

The whole example:

package com.xxx.yyy.zzz;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class KafkaConsumer {
    private final ConsumerConnector consumer;
    private final String topic;
    private int count = 0;

    public KafkaConsumer(final String zookeeper, final String groupId, final String topic) {
        this.consumer = Consumer.createJavaConsumerConnector(createConsumerConfig(zookeeper, groupId));
        this.topic = topic;
    }

    // Initialize connection properties to Kafka and Zookeeper
    private static ConsumerConfig createConsumerConfig(final String zookeeper, final String groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeper);
        props.put("group.id", groupId);
        props.put("zookeeper.session.timeout.ms", "2000");
        props.put("zookeeper.sync.time.ms", "250");
        props.put("auto.commit.interval.ms", "1000");
        props.put("auto.offset.reset", "smallest");
        props.put("auto.commit.enable", "false");
        props.put("consumer.timeout.ms", "5000");

        return new ConsumerConfig(props);
    }

    private void getData() {
        List<byte[]> msgs = new ArrayList<>();
        Map<String, Integer> topicMap = new HashMap<>();

        // Define single thread for topic
        topicMap.put(topic, 1);
        try {
            Map<String, List<KafkaStream<byte[], byte[]>>> listMap = consumer.createMessageStreams(topicMap);
            List<KafkaStream<byte[], byte[]>> kafkaStreams = listMap.get(topic);

            // Collect the messages.
            kafkaStreams.forEach(ks -> ks.forEach(mam -> msgs.add(mam.message())));

        } catch (ConsumerTimeoutException exception) {
            // There are no more messages available, so we are done.

            // Now print all your messages (decode the bytes; printing a byte[] shows only its reference)
            msgs.forEach(m -> System.out.println(new String(m)));

            // count them
            count = msgs.size();
        } finally {
            if (consumer != null) {
                consumer.shutdown();
            }
        }
    }
}
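
The example above relies on one idea: keep reading until no message arrives within a timeout, then treat that as the end of the data and count what was collected. The following is only a minimal sketch of that pattern, and it does not use any Kafka API. As an assumption for illustration, a java.util.concurrent.BlockingQueue stands in for the Kafka stream and the timeoutMs parameter stands in for consumer.timeout.ms; the class name TimeoutDrain and the method drain are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in, not Kafka code: the queue plays the role of the
// Kafka stream and timeoutMs plays the role of consumer.timeout.ms.
final class TimeoutDrain {

    // Reads messages until none arrives within timeoutMs,
    // then returns everything read so far, decoded as UTF-8.
    static List<String> drain(BlockingQueue<byte[]> source, long timeoutMs) {
        List<String> messages = new ArrayList<>();
        while (true) {
            byte[] raw;
            try {
                raw = source.poll(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop reading
                break;
            }
            if (raw == null) {
                break; // timeout elapsed with no new message: treat as end of data
            }
            messages.add(new String(raw, StandardCharsets.UTF_8));
        }
        return messages;
    }

    public static void main(String[] args) {
        BlockingQueue<byte[]> source = new LinkedBlockingQueue<>();
        source.add("first".getBytes(StandardCharsets.UTF_8));
        source.add("second".getBytes(StandardCharsets.UTF_8));

        List<String> messages = drain(source, 200);
        messages.forEach(System.out::println);
        System.out.println("count = " + messages.size());
    }
}
```

The timeout is a heuristic rather than a true end-of-topic signal: a producer that pauses for longer than the timeout looks the same as a topic that has been fully read, so the value should be chosen with the expected message rate in mind.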
