Kafka consumer in Java not consuming messages

I am trying to write a Kafka consumer in Java to get messages that are produced and posted to a topic. My consumer is as follows.

consumer.java

import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.javaapi.message.ByteBufferMessageSet;
import kafka.message.MessageAndOffset;



public class KafkaConsumer extends  Thread {
    final static String clientId = "SimpleConsumerDemoClient";
    final static String TOPIC = "AATest"; // no stray whitespace around the topic name
    ConsumerConnector consumerConnector;


    public static void main(String[] argv) throws UnsupportedEncodingException {
        KafkaConsumer consumer = new KafkaConsumer();
        consumer.start();
    }

    public KafkaConsumer(){
        Properties properties = new Properties();
        properties.put("zookeeper.connect","10.200.208.59:2181");
        properties.put("group.id","test-group");      
        ConsumerConfig consumerConfig = new ConsumerConfig(properties);
        consumerConnector = Consumer.createJavaConsumerConnector(consumerConfig);
    }

    @Override
    public void run() {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(TOPIC, 1);
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumerConnector.createMessageStreams(topicCountMap);
        KafkaStream<byte[], byte[]> stream =  consumerMap.get(TOPIC).get(0);
        System.out.println(stream);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            System.out.println("from it");
            // it.next() must be inside the loop body, otherwise the iterator never advances
            System.out.println(new String(it.next().message()));
        }

    }

    private static void printMessages(ByteBufferMessageSet messageSet) throws UnsupportedEncodingException {
        for(MessageAndOffset messageAndOffset: messageSet) {
            ByteBuffer payload = messageAndOffset.message().payload();
            byte[] bytes = new byte[payload.limit()];
            payload.get(bytes);
            System.out.println(new String(bytes, "UTF-8"));
        }
    }
}

When I run the above code I get nothing in the console, even though the Java producer program behind the scenes is continuously posting data to the 'AATest' topic. Also, in the ZooKeeper console I see the following lines when I try running the above consumer.java:

[2015-04-30 15:57:31,284] INFO Accepted socket connection from /10.200.208.59:51780 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2015-04-30 15:57:31,284] INFO Client attempting to establish new session at /10.200.208.59:51780 (org.apache.zookeeper.server.ZooKeeperServer)
[2015-04-30 15:57:31,315] INFO Established session 0x14d09cebce30007 with negotiated timeout 6000 for client /10.200.208.59:51780 (org.apache.zookeeper.server.ZooKeeperServer)

Also, when I run a separate console consumer pointing to the AATest topic, I get all the data produced by the producer to that topic.

Both the consumer and the broker are on the same machine, whereas the producer is on a different machine. This actually resembles this question, but going through it didn't help me. Please help me.

A different answer, but in my case it happened to be the initial offset (auto.offset.reset) of the consumer. Setting auto.offset.reset=earliest fixed the problem in my scenario, because I was publishing the event first and then starting the consumer.

By default, a consumer only consumes events published after it started, because auto.offset.reset=latest by default.

e.g. consumer.properties

bootstrap.servers=localhost:9092
enable.auto.commit=true
auto.commit.interval.ms=1000
session.timeout.ms=30000
auto.offset.reset=earliest
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
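
For comparison, here is a minimal sketch of the same fix applied programmatically with the Java consumer API; the broker address, group id, and topic name are placeholders for your own setup:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EarliestOffsetConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "test-group");
        // read from the beginning of the log when this group has no committed offset
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("AATest"));
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        } finally {
            consumer.close();
        }
    }
}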

Test

import java.util.{Collections, Date, Properties}

import scala.collection.JavaConverters._

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.scalatest.FunSuite

// BaseEvent and EmbeddedKafka come from the author's own project and test utilities
class KafkaEventConsumerSpecs extends FunSuite {

  case class TestEvent(eventOffset: Long, hashValue: Long, created: Date, testField: String) extends BaseEvent

  test("given an event in the event-store, consumes an event") {

    EmbeddedKafka.start()

    //PRODUCE
    val event = TestEvent(0l, 0l, new Date(), "data")
    val config = new Properties() {
      {
        load(this.getClass.getResourceAsStream("/producer.properties"))
      }
    }
    val producer = new KafkaProducer[String, String](config)

    val persistedEvent = producer.send(new ProducerRecord(event.getClass.getSimpleName, event.toString))

    assert(persistedEvent.get().offset() == 0)
    assert(persistedEvent.get().checksum() != 0)

    //CONSUME
    val consumerConfig = new Properties() {
      {
        load(this.getClass.getResourceAsStream("/consumer.properties"))
        put("group.id", "consumers_testEventsGroup")
        put("client.id", "testEventConsumer")
      }
    }

    assert(consumerConfig.getProperty("group.id") == "consumers_testEventsGroup")

    val kafkaConsumer = new KafkaConsumer[String, String](consumerConfig)

    assert(kafkaConsumer.listTopics().asScala.map(_._1).toList == List("TestEvent"))

    kafkaConsumer.subscribe(Collections.singletonList("TestEvent"))

    val events = kafkaConsumer.poll(1000)
    assert(events.count() == 1)

    EmbeddedKafka.stop()
  }
}

But if the consumer is started first and the event is published afterwards, the consumer should be able to consume the event without auto.offset.reset needing to be set to earliest.

References for Kafka 0.10

https://kafka.apache.org/documentation/#consumerconfigs

In our case, we solved our problem with the following steps:

The first thing we found is that the KafkaProducer has a config called retries, and its default value means 'no retry'. Also, the send method of KafkaProducer is asynchronous: if you do not call the get method on the result of send, there is no guarantee that produced messages are delivered to the corresponding broker without retries. So you have to increase retries a bit, or use the idempotent or transactional mode of KafkaProducer.
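
As a rough illustration of that point, here is a minimal Java sketch of a producer configured with retries and a synchronous send; the broker address, topic, and record values are placeholders:

import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class ReliableProducer {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("retries", "3");                        // retry transient send failures
        props.put("acks", "all");                         // wait for all in-sync replicas
        // alternatively, enable.idempotence=true turns on retries and acks=all
        // automatically (supported from Kafka 0.11 onwards)
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        try {
            // calling get() blocks until the broker acknowledges the write,
            // so delivery failures surface as exceptions instead of being silently lost
            RecordMetadata metadata =
                producer.send(new ProducerRecord<>("AATest", "key", "value")).get();
            System.out.println("persisted at offset " + metadata.offset());
        } finally {
            producer.close();
        }
    }
}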

The second case is about the Kafka and ZooKeeper versions. We chose Kafka 1.0.0 with ZooKeeper 3.4.4. Kafka 1.0.0 had an issue with ZooKeeper connectivity: if Kafka lost its connection to ZooKeeper with an unexpected exception, it lost the leadership of the partitions which had not synced yet. There is a bug report about this issue: https://issues.apache.org/jira/browse/KAFKA-2729. After we found entries in our Kafka log indicating the same issue as described in that ticket, we upgraded our Kafka broker version to 1.1.0.

It is also important to notice that a small number of partitions (like 100 or fewer) increases the throughput of the producer, so if there are not enough consumers, the available consumers can get stuck and fall behind, with delayed messages (we measured delays of approximately 10-15 minutes). So you need to balance and configure the partition count and the thread count of your application correctly according to your available resources.
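
To make that balancing point concrete, here is a hypothetical Java sketch that runs one consumer thread per partition in the same group; the thread count, topic, and broker address are assumptions for illustration (threads beyond the partition count would simply sit idle):

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BalancedConsumerGroup {
    public static void main(String[] args) {
        int partitionCount = 4; // match this to the topic's actual partition count
        ExecutorService pool = Executors.newFixedThreadPool(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            pool.submit(() -> {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092"); // placeholder
                props.put("group.id", "test-group");
                props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                // KafkaConsumer is not thread-safe, so each thread gets its own instance;
                // the group coordinator spreads the partitions across these consumers
                KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
                consumer.subscribe(Collections.singletonList("AATest"));
                while (!Thread.currentThread().isInterrupted()) {
                    consumer.poll(1000).forEach(record ->
                        System.out.println(Thread.currentThread().getName() + ": " + record.value()));
                }
                consumer.close();
            });
        }
    }
}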

There might also be a case where Kafka takes a long time to rebalance consumer groups when a new consumer is added to the same group id. Check the Kafka logs to see whether the group is rebalanced after starting your consumer.
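
One way to observe a rebalance from the client side is to attach a rebalance listener when subscribing. Here is a minimal Java sketch, using the same placeholder broker, group, and topic names as above:

import java.util.Collection;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RebalanceAwareConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "test-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("AATest"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                System.out.println("rebalance started, partitions revoked: " + partitions);
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // no records arrive until this fires and partitions are assigned
                System.out.println("rebalance finished, partitions assigned: " + partitions);
            }
        });
        while (true) {
            consumer.poll(1000).forEach(record -> System.out.println(record.value()));
        }
    }
}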
