Apache Kafka Java consumer does not receive message for topic with replication factor more than one
I'm starting on Apache Kafka with a simple Producer, Consumer app in Java. I'm using kafka-clients version 0.10.0.1 and running it on a Mac.
I created a topic named replicated_topic_partitioned with 3 partitions and a replication factor of 3.
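The exact create command isn't shown in the question; it was presumably something along these lines (--bootstrap-server is accepted by kafka-topics.sh from Kafka 2.2 onwards, matching the 2.3.0 tooling used below):

kafka_2.12-2.3.0/bin/kafka-topics.sh --create --topic replicated_topic_partitioned --bootstrap-server localhost:9092 --partitions 3 --replication-factor 3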
I started ZooKeeper on port 2181, and three brokers with ids 1, 2 and 3 on ports 9092, 9093 and 9094 respectively.
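The startup sequence for such a setup is typically the following (the per-broker file names server-1.properties etc. are an assumption; the question only shows one of the files):

kafka_2.12-2.3.0/bin/zookeeper-server-start.sh config/zookeeper.properties
kafka_2.12-2.3.0/bin/kafka-server-start.sh config/server-1.properties
kafka_2.12-2.3.0/bin/kafka-server-start.sh config/server-2.properties
kafka_2.12-2.3.0/bin/kafka-server-start.sh config/server-3.properties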
Here's the output of the describe command:
kafka_2.12-2.3.0/bin/kafka-topics.sh --describe --topic replicated_topic_partitioned --bootstrap-server localhost:9092
Topic:replicated_topic_partitioned PartitionCount:3 ReplicationFactor:3 Configs:segment.bytes=1073741824
Topic: replicated_topic_partitioned Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: replicated_topic_partitioned Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: replicated_topic_partitioned Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
I wrote a simple producer and a simple consumer. The producer ran successfully and published the messages. But when I start the consumer, the poll call just waits indefinitely. On debugging, I found that it keeps looping in the awaitMetadataUpdate method of the ConsumerNetworkClient.
Here is the code for the Producer and the Consumer.

Producer.java
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> myProducer = new KafkaProducer<>(properties);
DateFormat dtFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss:SSS");
String topic = "replicated_topic_partitioned";
int numberOfRecords = 10;

try {
    for (int i = 0; i < numberOfRecords; i++) {
        String message = String.format("Message: %s sent at %s", Integer.toString(i), dtFormat.format(new Date()));
        System.out.println("Sending " + message);
        myProducer.send(new ProducerRecord<String, String>(topic, message));
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    myProducer.close();
}
Consumer.java
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("group.id", UUID.randomUUID().toString());
properties.put("auto.offset.reset", "earliest");

KafkaConsumer<String, String> myConsumer = new KafkaConsumer<>(properties);
String topic = "replicated_topic_partitioned";
myConsumer.subscribe(Collections.singletonList(topic));

try {
    while (true) {
        ConsumerRecords<String, String> records = myConsumer.poll(1000);
        printRecords(records);
    }
} finally {
    myConsumer.close();
}
Adding some key fields from server.properties:
broker.id=1
host.name=localhost
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs-1
num.partitions=1
num.recovery.threads.per.data.dir=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
The server.properties for the other two brokers was a replica of the above, with broker.id, the port, and log.dirs changed.
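The question doesn't show the other two files; a sketch of what broker 2's overrides might look like (using listeners for the port, which is the non-deprecated style in this Kafka version; the exact mechanism in the original files is an assumption):

broker.id=2
listeners=PLAINTEXT://localhost:9093
log.dirs=/tmp/kafka-logs-2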
This did not work for me: Kafka 0.9.0.1 Java Consumer stuck in awaitMetadataUpdate()
But if I start the consumer from the command line passing a partition, it successfully reads the messages for that partition. It does not receive any message when just a topic is specified.
Works:
kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --topic replicated_topic_partitioned --bootstrap-server localhost:9092
--from-beginning --partition 1
Does not work:
kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --topic replicated_topic_partitioned --bootstrap-server localhost:9092
--from-beginning
NOTE: The above consumer works perfectly for a topic with replication factor equal to 1.
Question:
Why does the Java consumer not read any message for a topic with a replication factor of more than one, even when assigned to a partition (like myConsumer.assign(Collections.singletonList(new TopicPartition(topic, 2))))?
Why does the console consumer read messages only when passed a partition (again, it works for a topic with a replication factor of one)?
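For reference, the assign-based variant of the consumer would look roughly like this (reconstructed from the snippet above; the seekToBeginning call is an assumption, added to mirror the console consumer's --from-beginning flag):

// Reconstructed sketch of the assign-based attempt described above.
TopicPartition partition = new TopicPartition(topic, 2);
myConsumer.assign(Collections.singletonList(partition));
// Assumption: start from the beginning of the partition, as
// --from-beginning does for the console consumer.
myConsumer.seekToBeginning(Collections.singletonList(partition));
while (true) {
    ConsumerRecords<String, String> records = myConsumer.poll(1000);
    printRecords(records);
}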
So, you're sending 10 records, but all 10 records have the SAME key:
for (int i = 0; i < numberOfRecords; i++) {
    String message = String.format("Message: %s sent at %s", Integer.toString(i), dtFormat.format(new Date()));
    System.out.println("Sending " + message);
    myProducer.send(new ProducerRecord<String, String>(topic, message)); // <--- KEY=topic
}
Unless told otherwise (by setting a partition directly on the ProducerRecord), the partition into which a record is delivered is determined by something like:
partition = murmur2(serialize(key)) % numPartitions
So the same key means the same partition.
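If you want to check where a given key lands, you can mirror this computation with the Utils class that ships in kafka-clients (a sketch only; the real DefaultPartitioner additionally handles null keys and available partitions differently):

import org.apache.kafka.common.utils.Utils;

import java.nio.charset.StandardCharsets;

public class PartitionForKey {
    public static void main(String[] args) {
        String key = "some-key";  // assumed example key
        int numPartitions = 3;    // matches the topic above

        // Same hashing the default partitioner uses for non-null keys;
        // StringSerializer produces UTF-8 bytes, so serialize(key) == getBytes(UTF_8).
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        System.out.println("Key '" + key + "' maps to partition " + partition);
    }
}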
Have you tried searching for your 10 records on partitions 0 and 2, maybe?
If you want a better "spread" of records amongst partitions, either use a null key (you'd get round robin) or a variable key.
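For example, using the loop index as a variable key would hash each record independently, so the records spread across the partitions (a sketch based on the producer loop above):

for (int i = 0; i < numberOfRecords; i++) {
    String key = Integer.toString(i); // variable key: different records hash to different partitions
    String message = String.format("Message: %s sent at %s", key, dtFormat.format(new Date()));
    // Three-argument constructor: (topic, key, value)
    myProducer.send(new ProducerRecord<String, String>(topic, key, message));
}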
Disclaimer: This is not an answer.
The Java consumer is now working as expected. I did not make any change to the code or the configuration. The only thing I did was restart my Mac. This caused the kafka-logs folder (and the zookeeper folder too, I guess) to be deleted.
I re-created the topic (with the same command: 3 partitions, replication factor of 3), then re-started the brokers with the same configuration, with no advertised.host.name or advertised.port config.
So, recreating the kafka-logs folders and the topic remediated whatever was causing the issue earlier.
My only suspect is a non-properly terminated consumer. I initially ran the consumer code without the close call on the consumer in the finally block. I also had the same group.id. Maybe all 3 partitions were assigned to consumers that weren't properly terminated or closed. This is just a guess.
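If leaked consumers were indeed the cause, try-with-resources is a way to guarantee the close even when the loop exits with an exception, since KafkaConsumer implements Closeable (a sketch of the restructured consumer loop):

// KafkaConsumer implements Closeable, so try-with-resources closes it
// even if poll() or printRecords() throws.
try (KafkaConsumer<String, String> myConsumer = new KafkaConsumer<>(properties)) {
    myConsumer.subscribe(Collections.singletonList(topic));
    while (true) {
        ConsumerRecords<String, String> records = myConsumer.poll(1000);
        printRecords(records);
    }
}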
But even calling myConsumer.position(new TopicPartition(topic, 2)) did not return a response earlier, when I had assigned the consumer to a partition. It was looping in the same awaitMetadataUpdate method.