Kafka Producer publishing message to single partition

I am new to Kafka and am working through the official documentation.

On my local system I have started a single Kafka instance along with ZooKeeper. ZooKeeper and the Kafka server are both running on their default ports.

I have created a topic "test" with a replication factor of 1, since I have only a single Kafka instance up and running.

I have also given the topic two partitions.

I have two consumers subscribed to this topic within the same consumer group.

For now I have started the consumers from the command prompt on a Windows machine.

When I start the producer from the command prompt and publish messages to the topic, everything works fine. Kafka distributes the messages round robin across both partitions, and the consumers receive messages alternately, since each of them is listening to a separate partition.

But when I create a producer using the Java kafka-clients jar, the producer pushes all the messages to the same partition even though I use different keys for the messages, so all the messages are received by the same consumer.

The partition is not fixed, either: it keeps changing every time I run my producer.

I have tried the same scenario with a producer started from the command prompt, using exactly the same configuration that I pass to the kafka-clients producer in Java code. The command-prompt producer seems to work fine, but the code producer pushes all the messages to the same partition.

I have tried changing the key of certain messages, hoping the broker would send them to a different partition, since the documentation mentions that the broker routes messages using the message key.

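import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaProducerParallel {

    public static void main(String[] args) throws InterruptedException, ExecutionException {

        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        properties.put(ProducerConfig.CLIENT_ID_CONFIG, "parallelism-producer");
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class);

        Producer<String, Long> parallelProducer = new KafkaProducer<>(properties);

        for (long i = 0; i < 100; i++) {

            ProducerRecord<String, Long> producerRecord;

            // The first 50 records use the key "Amoeba", the remaining 50 use "Bacteria".
            if (i < 50) {
                producerRecord = new ProducerRecord<>("second-topic", "Amoeba", i);
            } else {
                producerRecord = new ProducerRecord<>("second-topic", "Bacteria", i);
            }

            // Block on the send so the assigned partition can be printed per record.
            RecordMetadata recordMetadata = parallelProducer.send(producerRecord).get();

            System.out.printf("Sent record : with key %s and value %d to partition %s%n",
                    producerRecord.key(), producerRecord.value(), recordMetadata.partition());
        }

        parallelProducer.close();
    }
}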

As per the documentation, the Kafka broker decides which partition to put a particular message in by hashing the message key. I change the key of my records partway through, but the messages still go to the same partition every time.

Sample console output of the code:

  Sent record : with key Amoeba and value 0 to partition 1
  Sent record : with key Amoeba and value 1 to partition 1
  Sent record : with key Amoeba and value 2 to partition 1
  Sent record : with key Amoeba and value 3 to partition 1
  Sent record : with key Amoeba and value 4 to partition 1
  Sent record : with key Amoeba and value 5 to partition 1
  Sent record : with key Amoeba and value 6 to partition 1
  Sent record : with key Amoeba and value 7 to partition 1
  Sent record : with key Amoeba and value 8 to partition 1
  Sent record : with key Amoeba and value 9 to partition 1
  Sent record : with key Amoeba and value 10 to partition 1
  Sent record : with key Amoeba and value 11 to partition 1
  Sent record : with key Amoeba and value 12 to partition 1
  Sent record : with key Amoeba and value 13 to partition 1

 Sent record : with key Bacteria and value 87 to partition 1
 Sent record : with key Bacteria and value 88 to partition 1
 Sent record : with key Bacteria and value 89 to partition 1
 Sent record : with key Bacteria and value 90 to partition 1
 Sent record : with key Bacteria and value 91 to partition 1
 Sent record : with key Bacteria and value 92 to partition 1
 Sent record : with key Bacteria and value 93 to partition 1
 Sent record : with key Bacteria and value 94 to partition 1
 Sent record : with key Bacteria and value 95 to partition 1
 Sent record : with key Bacteria and value 96 to partition 1
 Sent record : with key Bacteria and value 97 to partition 1
 Sent record : with key Bacteria and value 98 to partition 1
 Sent record : with key Bacteria and value 99 to partition 1

Everything works as expected.

In your particular case, the Partitioner that KafkaProducer uses to determine the partition computes the same partition for both keys, Amoeba and Bacteria. By default, KafkaProducer uses org.apache.kafka.clients.producer.internals.DefaultPartitioner.

Suggestion: change the keys or increase the number of partitions.

Notice: the producer, not the broker, decides which partition a message is written to.
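
To make the key-to-partition mapping concrete, here is a minimal, self-contained sketch of how a keyed record's partition can be computed on the producer side. It assumes the default partitioner hashes the serialized key with 32-bit MurmurHash2, clears the sign bit, and takes the result modulo the partition count. The murmur2 method below is a re-implementation written for illustration and should be checked against org.apache.kafka.common.utils.Utils.murmur2 in your kafka-clients version. The class name KeyPartitionSketch and the helper partitionFor are hypothetical, and the keys are assumed to be serialized as UTF-8 (as StringSerializer does by default).

```java
import java.nio.charset.StandardCharsets;

public class KeyPartitionSketch {

    // Illustrative re-implementation of 32-bit MurmurHash2 (assumed to match
    // the hash used by the default partitioner; verify against kafka-clients).
    static int murmur2(byte[] data) {
        int length = data.length;
        int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;

        int h = seed ^ length;
        int length4 = length / 4;

        // Mix the input four bytes at a time (little-endian).
        for (int i = 0; i < length4; i++) {
            int i4 = i * 4;
            int k = (data[i4] & 0xff)
                    + ((data[i4 + 1] & 0xff) << 8)
                    + ((data[i4 + 2] & 0xff) << 16)
                    + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }

        // Mix in the remaining 1 to 3 bytes; the cases fall through on purpose.
        int tail = length & ~3;
        switch (length % 4) {
            case 3:
                h ^= (data[tail + 2] & 0xff) << 16;
            case 2:
                h ^= (data[tail + 1] & 0xff) << 8;
            case 1:
                h ^= data[tail] & 0xff;
                h *= m;
        }

        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Hypothetical helper: partition index for a String key serialized as UTF-8.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int positiveHash = murmur2(keyBytes) & 0x7fffffff; // clear the sign bit
        return positiveHash % numPartitions;
    }

    public static void main(String[] args) {
        String[] keys = {"Amoeba", "Bacteria"};
        int[] partitionCounts = {2, 3, 6};
        for (int n : partitionCounts) {
            for (String key : keys) {
                System.out.printf("partitions=%d key=%s -> partition %d%n",
                        n, key, partitionFor(key, n));
            }
        }
    }
}
```

Because the mapping depends only on the key bytes and the partition count, a given key always lands on the same partition for a given topic layout. With just two partitions, any two keys have roughly an even chance of sharing one, which is why trying other keys or adding partitions changes the picture.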

Change the declaration from Producer<String, String> producer = new KafkaProducer<String, String> to:

KafkaProducer<String, String> producer = new KafkaProducer<String, String>

By default, the interface implementation places data into the same partition, so use KafkaProducer instead of the plain Producer.

Since version 2.4 of Apache Kafka, the default partitioning strategy for records with a null key has changed: sticky partitioning is now the default behavior.

The previous round-robin strategy spread records with a null key across partitions. The new sticky partitioning strategy sends records to the same partition until that partition's batch is "complete" (as determined by batch.size or linger.ms).

Check out this article for more info: Improvements with Sticky Partitioner
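
The difference between the two strategies for null-key records can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not Kafka's implementation: a batch is treated as "complete" after a fixed number of records (the hypothetical recordsPerBatch parameter) rather than by batch.size bytes or linger.ms time, and the sticky model simply moves on to the next partition in order, whereas the real producer chooses the next partition itself. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class StickyVsRoundRobinSketch {

    // Round robin: every record goes to the next partition in turn.
    static List<Integer> roundRobin(int records, int numPartitions) {
        List<Integer> assignments = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            assignments.add(i % numPartitions);
        }
        return assignments;
    }

    // Toy sticky model: stay on one partition until the current batch holds
    // recordsPerBatch records, then switch to the next partition.
    static List<Integer> sticky(int records, int numPartitions, int recordsPerBatch) {
        List<Integer> assignments = new ArrayList<>();
        int current = 0;
        int inBatch = 0;
        for (int i = 0; i < records; i++) {
            if (inBatch == recordsPerBatch) {
                current = (current + 1) % numPartitions;
                inBatch = 0;
            }
            assignments.add(current);
            inBatch++;
        }
        return assignments;
    }

    public static void main(String[] args) {
        System.out.println("round robin: " + roundRobin(6, 2));   // [0, 1, 0, 1, 0, 1]
        System.out.println("sticky:      " + sticky(6, 2, 3));    // [0, 0, 0, 1, 1, 1]
    }
}
```

In the sticky model, a short burst of records that fits in one batch all lands on a single partition, which matches the behavior described above for null keys on 2.4 and later.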
