
Kafka ordering with multiple producers on same topic and partition

Let's say I have two producers (ProducerA and ProducerB) writing to the same topic with a single partition. Each producer is writing its own unique events serially. So if ProducerA fired 3 events and then ProducerB fired 3 events, my understanding is that Kafka cannot guarantee the order across the producers' events like this:

  1. ProducerA_event_1
  2. ProducerA_event_2
  3. ProducerA_event_3
  4. ProducerB_event_1
  5. ProducerB_event_2
  6. ProducerB_event_3

due to acking, retrying, etc.

However, will an individual producer's events still be in order? For example:

  1. ProducerA_event_1
  2. ProducerB_event_2
  3. ProducerB_event_1
  4. ProducerA_event_2
  5. ProducerA_event_3
  6. ProducerB_event_3

This is of course a simplified version of what I am doing, but I just want to guarantee that if I am reading from a topic for a specific producer's events, then those events will be in order even if other producers' events are interleaved with them.

There is a nice article on Medium which states that Kafka does not always guarantee message ordering, even for the same producer. It all depends on the Kafka configuration. In particular, max.in.flight.requests.per.connection has to be set to 1. The reason is that if there are multiple requests (say, 2) in flight and the first one fails and is retried, the second one will be appended to the log earlier, thus breaking the ordering.
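To make that failure mode concrete, here is a toy simulation in plain Python. It is not Kafka client code: the function name `simulate_append` and the failure model (a failed batch is resent only after the other batches already in flight have landed) are assumptions made for illustration.

```python
def simulate_append(batches, max_in_flight, fail_once):
    """Toy model of one producer writing to one partition.

    batches: batch names in the order the producer sends them.
    max_in_flight: how many batches may be unacknowledged at once.
    fail_once: batches whose first attempt fails and is retried.
    Returns the order in which batches land in the partition log.
    """
    log = []
    pending = list(batches)
    already_failed = set()
    while pending:
        # Send up to max_in_flight batches without waiting for acks.
        window = pending[:max_in_flight]
        pending = pending[max_in_flight:]
        retry = []
        for batch in window:
            if batch in fail_once and batch not in already_failed:
                already_failed.add(batch)
                retry.append(batch)  # first attempt lost, resend later
            else:
                log.append(batch)    # broker appends in arrival order
        # Retries go ahead of batches that have not been sent yet.
        pending = retry + pending
    return log


# Two requests in flight: A2 lands before the retried A1.
print(simulate_append(["A1", "A2", "A3"], 2, {"A1"}))  # ['A2', 'A1', 'A3']
# One request in flight: the retry of A1 completes before A2 is sent.
print(simulate_append(["A1", "A2", "A3"], 1, {"A1"}))  # ['A1', 'A2', 'A3']
```

In a real producer the settings involved are `max.in.flight.requests.per.connection` and `retries`. As far as I know, newer clients can also keep ordering with more than one request in flight (up to 5) when `enable.idempotence` is set to true, so it is worth checking the producer configuration docs for the client version you run.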

The short answer to this one is yes: an individual producer's events are guaranteed to be in order.

Messages in Kafka are appended to a topic partition in the order they are sent, and consumers read the messages in the same order they are stored in the topic partition.

So, assuming you are interested in the messages from Producer A and are filtering out everything else, then in the given scenario you can expect events 1, 2 and 3 from Producer A to be read in order.
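As a rough sketch of that filtering on the consumer side: the snippet below models the partition log as a list of `(producer_id, event_number)` tuples. That shape is an assumption for illustration; with a real consumer the producer identifier would have to be carried in the record key, value, or a header.

```python
def events_from(log, producer_id):
    """Return one producer's events in the order they appear in the log."""
    return [event for (producer, event) in log if producer == producer_id]


# An interleaved partition log in which each producer's own events are in order.
log = [
    ("ProducerA", 1),
    ("ProducerB", 1),
    ("ProducerB", 2),
    ("ProducerA", 2),
    ("ProducerA", 3),
    ("ProducerB", 3),
]

print(events_from(log, "ProducerA"))  # [1, 2, 3]
print(events_from(log, "ProducerB"))  # [1, 2, 3]
```

Filtering never reorders anything, so whatever order a producer's events have in the partition is the order the consumer sees.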

PS: I am, however, curious to understand the motivation behind using just one partition. Also, on your statement:

"So if ProducerA fired 3 events and then ProducerB fired 3 events, my understanding is that Kafka cannot guarantee the order across the producer's events like this:"

You are correct in saying that the overall ordering cannot be guaranteed, but ordering within a partition can be.

I hope this helps.

A producer's messages will be stored, per partition, in the order they are received. If you can guarantee message ordering on the producer, then consumers can assume ordering when polling. Retry logic, multiple KafkaProducer instances, and other asynchronous implementation details might complicate ordered message production. Often these can be mitigated by including a unique event identifier, an identifier of the producer, and a timestamp of sufficient granularity in either the key or the value of the message. Relying on ordering in an asynchronous framework often works in the best case, but there should be some way to compensate when things arrive out of order.
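One way such compensation could look on the consumer side is sketched below. The `Event` fields and the `classify` function are hypothetical names, and the sketch assumes each producer stamps its events with a timestamp that strictly increases per producer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str      # unique per event (hypothetical field)
    producer_id: str   # which producer sent it (hypothetical field)
    timestamp_ns: int  # producer-side timestamp, assumed increasing per producer


def classify(events):
    """Label each event as 'ok', 'duplicate', or 'out_of_order'."""
    seen_ids = set()
    last_ts = {}  # latest accepted timestamp per producer
    labels = []
    for e in events:
        if e.event_id in seen_ids:
            labels.append("duplicate")  # e.g. a retry that was delivered twice
            continue
        seen_ids.add(e.event_id)
        if e.timestamp_ns < last_ts.get(e.producer_id, -1):
            labels.append("out_of_order")  # older than something already seen
        else:
            labels.append("ok")
            last_ts[e.producer_id] = e.timestamp_ns
    return labels


events = [
    Event("a1", "A", 100),
    Event("b2", "B", 220),
    Event("b1", "B", 210),  # arrives after b2 although it was produced first
    Event("a1", "A", 100),  # same event delivered again
    Event("a2", "A", 110),
]
print(classify(events))  # ['ok', 'ok', 'out_of_order', 'duplicate', 'ok']
```

What to do with an event flagged `out_of_order` (buffer and re-sort, drop, or send to a side channel) depends on the application; the sketch only shows that the three fields are enough to detect the problem.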
