
Apache Kafka Java consumer does not receive messages for a topic with replication factor more than one

I'm starting out with Apache Kafka, with a simple Producer/Consumer app in Java. I'm using kafka-clients version 0.10.0.1 and running it on a Mac.

I created a topic named replicated_topic_partitioned with 3 partitions and a replication factor of 3.
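
The topic was created with something like the following (the exact create command isn't shown here, so treat it as an approximation):

kafka_2.12-2.3.0/bin/kafka-topics.sh --create --topic replicated_topic_partitioned --partitions 3 --replication-factor 3 --bootstrap-server localhost:9092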

I started ZooKeeper on port 2181. I started three brokers with ids 1, 2 and 3 on ports 9092, 9093 and 9094 respectively.
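
The processes were started with the standard scripts, roughly like this (the per-broker properties file names are illustrative):

kafka_2.12-2.3.0/bin/zookeeper-server-start.sh config/zookeeper.properties
kafka_2.12-2.3.0/bin/kafka-server-start.sh config/server-1.properties
kafka_2.12-2.3.0/bin/kafka-server-start.sh config/server-2.properties
kafka_2.12-2.3.0/bin/kafka-server-start.sh config/server-3.properties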

Here's the output of the describe command:

kafka_2.12-2.3.0/bin/kafka-topics.sh --describe --topic replicated_topic_partitioned --bootstrap-server localhost:9092    
Topic:replicated_topic_partitioned    PartitionCount:3    ReplicationFactor:3    Configs:segment.bytes=1073741824
     Topic: replicated_topic_partitioned    Partition: 0    Leader: 3    Replicas: 3,1,2    Isr: 3,1,2
     Topic: replicated_topic_partitioned    Partition: 1    Leader: 1    Replicas: 1,2,3    Isr: 1,2,3
     Topic: replicated_topic_partitioned    Partition: 2    Leader: 2    Replicas: 2,3,1    Isr: 2,3,1

I wrote a simple producer and a simple consumer. The producer ran successfully and published the messages. But when I start the consumer, the poll call just waits indefinitely. On debugging, I found that it keeps looping in the awaitMetadataUpdate method of the ConsumerNetworkClient.

Here is the code for the producer and the consumer.

Producer.java

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> myProducer = new KafkaProducer<>(properties);
DateFormat dtFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss:SSS");
String topic = "replicated_topic_partitioned";

int numberOfRecords = 10;
try {
    for (int i = 0; i < numberOfRecords; i++) {
        String message = String.format("Message: %s  sent at %s", Integer.toString(i), dtFormat.format(new Date()));
        System.out.println("Sending " + message);
        // Asynchronous fire-and-forget send
        myProducer.send(new ProducerRecord<String, String>(topic, message));
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    myProducer.close();
}
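
As a side note, to see which partition each record actually lands on, send can also take a Callback. The sketch below is not part of the original code and assumes org.apache.kafka.clients.producer.Callback and org.apache.kafka.clients.producer.RecordMetadata are imported:

myProducer.send(new ProducerRecord<String, String>(topic, message), new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            exception.printStackTrace(); // delivery failed
        } else {
            // Confirms the partition and offset the record was written to
            System.out.println("Delivered to partition " + metadata.partition()
                    + " at offset " + metadata.offset());
        }
    }
});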

Consumer.java

import java.util.Collections;
import java.util.Properties;
import java.util.UUID;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
// Random group id, so each run starts as a fresh consumer group
properties.put("group.id", UUID.randomUUID().toString());
properties.put("auto.offset.reset", "earliest");

KafkaConsumer<String, String> myConsumer = new KafkaConsumer<>(properties);

String topic = "replicated_topic_partitioned";
myConsumer.subscribe(Collections.singletonList(topic));

try {
    while (true) {
        ConsumerRecords<String, String> records = myConsumer.poll(1000);
        printRecords(records); // helper that iterates the records and prints them
    }
} finally {
    myConsumer.close();
}
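
For comparison, manually assigning a single partition (as in question 1 below) bypasses the consumer-group coordinator entirely. A sketch, assuming org.apache.kafka.common.TopicPartition is imported:

// Manual assignment: no group coordination or rebalancing involved
TopicPartition partition2 = new TopicPartition(topic, 2);
myConsumer.assign(Collections.singletonList(partition2));
myConsumer.seekToBeginning(Collections.singletonList(partition2));
ConsumerRecords<String, String> records = myConsumer.poll(1000);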

Here are some key fields from server.properties:

broker.id=1 
host.name=localhost
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600

log.dirs=/tmp/kafka-logs-1
num.partitions=1
num.recovery.threads.per.data.dir=1

transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

zookeeper.connection.timeout.ms=6000

group.initial.rebalance.delay.ms=0

The server.properties for the other two brokers was a replica of the above, with broker.id, the port, and log.dirs changed.
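
For example, the second broker's overrides would look roughly like this (the listeners line is an assumption; the exact property used to change the port isn't shown above):

broker.id=2
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-2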

This did not work for me: Kafka 0.9.0.1 Java Consumer stuck in awaitMetadataUpdate()


But if I start the consumer from the command line passing a partition, it successfully reads the messages for that partition. However, it does not receive any messages when just the topic is specified.

Works:

kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --topic replicated_topic_partitioned --bootstrap-server localhost:9092 
     --from-beginning --partition 1

Does not work:

kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --topic replicated_topic_partitioned --bootstrap-server localhost:9092 
    --from-beginning 

NOTE: The above consumer works perfectly for a topic with a replication factor of 1.

Questions:

  1. Why does the Java consumer not read any messages for a topic with a replication factor of more than one, even when assigned to a partition (e.g. myConsumer.assign(Collections.singletonList(new TopicPartition(topic, 2))))?

  2. Why does the console consumer read messages only when passed a partition (again, it works for a topic with a replication factor of one)?

So, you're sending 10 records, but all 10 records have the SAME key:

for (int i = 0; i < numberOfRecords; i++) {
   String message = String.format("Message: %s  sent at %s", Integer.toString(i), dtFormat.format(new Date()));
   System.out.println("Sending " + message);
   myProducer.send(new ProducerRecord<String, String>(topic, message)); <--- KEY=topic
}

Unless told otherwise (by setting a partition directly on the ProducerRecord), the partition a record is delivered to is determined by something like:

partition = murmur2(serialize(key)) % numPartitions

So the same key means the same partition.

Have you tried searching for your 10 records on partitions 0 and 2, maybe?

If you want a better "spread" of records amongst partitions, either use a null key (you'd get round robin) or a variable key.
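
A minimal sketch of both options, reusing the topic, message and loop index i from the producer code above:

// Variable key: the (topic, key, value) constructor; the key drives the partition choice
myProducer.send(new ProducerRecord<String, String>(topic, Integer.toString(i), message));

// Explicit partition: the (topic, partition, key, value) constructor pins the record to partition 0
myProducer.send(new ProducerRecord<String, String>(topic, 0, Integer.toString(i), message));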

Disclaimer: This is not an answer.

The Java consumer is now working as expected. I did not make any changes to the code or the configuration. The only thing I did was restart my Mac. This caused the kafka-logs folder (and, I guess, the zookeeper folder too) to be deleted.

I re-created the topic (with the same command: 3 partitions, replication factor of 3). Then I re-started the brokers with the same configuration - no advertised.host.name or advertised.port config.

So, recreating the kafka-logs folder and the topics remediated whatever was causing the issue earlier.


My only suspect is an improperly terminated consumer. Initially, I ran the consumer code without the close call on the consumer in the finally block. I also had the same group.id. Maybe all 3 partitions were assigned to consumers that weren't properly terminated or closed. This is just a guess.
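
One way to check whether stale members were still holding the partitions would have been the consumer-groups tool (the group id below is a placeholder):

kafka_2.12-2.3.0/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group <group-id>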

But even calling myConsumer.position(new TopicPartition(topic, 2)) did not return a response earlier, when I had assigned the consumer to a partition. It was looping in the same awaitMetadataUpdate method.
