
How to make a partition of a topic in Kafka by user_id?

I'm building a web app backend with Spring Boot and I have to use Kafka for sending messages. I want to have a topic, for example "testTopic", and produce messages to it from different users so that they can later be delivered to different machines.

Suppose user A sends a message to his machine and user B sends a message to his machine. How can I tell who sent which message and which machine it should arrive at?

I've read about Kafka topic partitions, but I don't know if I'm using them correctly in my code.

Here I'm building my topic:

    @Bean
    public NewTopic kafkaExampleTopic() {
        return TopicBuilder.name("TestTopic").partitions(1).build();
    }

Here I'm sending data to that topic:

    @Bean
    CommandLineRunner commandLineRunner(KafkaTemplate<String, String> kafkaTemplate) {
        return args -> {
            kafkaTemplate.send("TestTopic", String.valueOf(MessageBuilder.withPayload("Hello kafka testTopic uno con key 1")
                    .setHeader(KafkaHeaders.MESSAGE_KEY, "1").build()));
            kafkaTemplate.send("TestTopic", String.valueOf(MessageBuilder.withPayload("Hello kafka testTopic uno con key 2")
                    .setHeader(KafkaHeaders.MESSAGE_KEY, "2").build()));
        };
    }

And this is my listener:

    @KafkaListener(topics = "TestTopic", groupId = "exampleGroupId")
    public void listenWithHeaders(
            @Payload String message,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) int partition) {
        System.out.println(
                "Received Message: " + message
                        + "from partition: " + partition);
    }

Thank you so much, guys!

Topic partitions need to be decided ahead of time.

For example, if you have numeric ids, you could define a topic with ten partitions, then create your own Partitioner class that routes each id to a partition based on its leading digit (ids 1, 10, 15, etc. all go to partition 1). If you use hexadecimal values (such as UUIDs), maybe a topic with 16 partitions (a-f, 0-9). For lowercase alphanumeric ids, 36, and so on.
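As a rough sketch of the leading-digit idea, the routing rule itself can be written as a plain function. The class and method names here (`LeadingDigitPartitioning`, `partitionFor`) are made up for illustration. In a real producer this logic would sit inside the `partition(...)` method of a class implementing Kafka's `org.apache.kafka.clients.producer.Partitioner` interface, registered through the `partitioner.class` producer property, and it only applies if the user id is actually sent as the record key.

```java
/** Illustrative only: maps a numeric user id to a partition by its leading digit. */
final class LeadingDigitPartitioning {

    private LeadingDigitPartitioning() {}

    /**
     * Returns the partition for a numeric id such as "15".
     * With a ten-partition topic the leading digit is used directly as the partition number.
     */
    static int partitionFor(String userId, int numPartitions) {
        if (userId == null || userId.isEmpty()) {
            throw new IllegalArgumentException("userId must not be empty");
        }
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Character.digit returns -1 when the character is not a decimal digit.
        int leadingDigit = Character.digit(userId.charAt(0), 10);
        if (leadingDigit < 0) {
            throw new IllegalArgumentException("userId must start with a digit: " + userId);
        }
        // The modulo keeps the result in range if the topic has fewer than ten partitions.
        return leadingDigit % numPartitions;
    }
}
```

With ten partitions, `partitionFor("1", 10)`, `partitionFor("10", 10)` and `partitionFor("15", 10)` all return 1, matching the example above.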

By default, Kafka's DefaultPartitioner computes a Murmur2 hash of the record key and takes it modulo the number of topic partitions. With that, it's possible that ids 5 and 7, for example, end up in the same partition. Depending on your consumer's needs, that might not be what you want.
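To see why two different ids can share a partition, here is a minimal sketch of hash-then-modulo partitioning. It deliberately uses `Arrays.hashCode` as a stand-in hash, not Murmur2, so the partition numbers it produces will not match what Kafka would choose; the class and method names are hypothetical. The point is only the shape of the computation: hash the key bytes, force the result non-negative, then take it modulo the partition count.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: hash-then-modulo partitioning with a stand-in hash (not Murmur2). */
final class HashModuloPartitioning {

    private HashModuloPartitioning() {}

    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes); // assumption: stand-in for Kafka's Murmur2 hash
        int positive = hash & 0x7fffffff;     // clear the sign bit so the result is non-negative
        return positive % numPartitions;      // always in the range [0, numPartitions)
    }
}
```

Because the result is always in `[0, numPartitions)`, any set of keys larger than the partition count must contain at least two keys that map to the same partition, whatever hash function is used.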

Consumers are what run on the different machines. The partitions shouldn't matter, except that two consumers in the same group cannot be assigned the same partition (if you only have one partition, only one consumer per group can read it).
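The one-partition limitation can be illustrated with a simplified model of how partitions are handed out within a consumer group. This is not Kafka's actual assignor, just a hypothetical round-robin sketch (the names `GroupAssignmentSketch` and `assign` are invented, and consumer ids are assumed to be unique) showing that each partition gets exactly one owner per group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a simplified round-robin model, not Kafka's real assignment logic. */
final class GroupAssignmentSketch {

    private GroupAssignmentSketch() {}

    /** Gives every partition exactly one owner among the consumers of a single group. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            // Each partition goes to one consumer only; consumers beyond the partition count stay idle.
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }
}
```

With the question's single-partition topic, `assign(Arrays.asList("machine-a", "machine-b"), 1)` yields `{machine-a=[0], machine-b=[]}`: the second machine in the same group receives nothing.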
