
Kafka ordering with multiple producers on same topic and partition

Let's say I have two producers (ProducerA and ProducerB) writing to the same topic with a single partition. Each producer is writing its own unique events serially. So if ProducerA fired 3 events and then ProducerB fired 3 events, my understanding is that Kafka cannot guarantee the order across the producers' events like this:

  1. ProducerA_event_1
  2. ProducerA_event_2
  3. ProducerA_event_3
  4. ProducerB_event_1
  5. ProducerB_event_2
  6. ProducerB_event_3

due to acking, retrying, etc.

However, will an individual producer's events still be in order? For example:

  1. ProducerA_event_1
  2. ProducerB_event_1
  3. ProducerB_event_2
  4. ProducerA_event_2
  5. ProducerA_event_3
  6. ProducerB_event_3

This is of course a simplified version of what I am doing, but I just want to guarantee that if I am reading from a topic for a specific producer's events, then those events will be in order even if other producers' events are interleaved with them.

There is a nice article on Medium which states that Kafka does not always guarantee message ordering, even for a single producer; it depends on the producer configuration. In particular, max.in.flight.requests.per.connection has to be set to 1. The reason is that if there are multiple requests (say, 2) in flight and the first one fails and is retried, the second one gets appended to the log first, breaking the ordering. Note that this applies when retries are enabled and enable.idempotence is false; with the idempotent producer (enable.idempotence=true), ordering is preserved with up to 5 in-flight requests.
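The reordering described above can be reproduced with a toy model. This is only a sketch under simplifying assumptions, not the real client: the function name simulate_partition_log and its parameters are made up for illustration, the "broker" handles one request at a time in arrival order, and a failed batch is re-sent behind whatever is already in flight.

```python
from collections import deque


def simulate_partition_log(batches, fail_once, max_in_flight):
    """Toy model of one non-idempotent producer writing to one partition.

    batches       -- batch names in the order the application sent them
    fail_once     -- batches whose first send attempt fails and is retried
    max_in_flight -- how many unacknowledged requests may be outstanding
    """
    pending = deque(batches)   # batches not yet on the wire
    in_flight = deque()        # sent, waiting for the broker's response
    already_failed = set()
    log = []                   # what ends up in the partition, in offset order

    while pending or in_flight:
        # The producer keeps the in-flight window as full as it can.
        while pending and len(in_flight) < max_in_flight:
            in_flight.append(pending.popleft())

        # The broker handles the oldest outstanding request.
        batch = in_flight.popleft()
        if batch in fail_once and batch not in already_failed:
            already_failed.add(batch)
            # Retry: the batch is sent again, but only after the
            # requests that are already in flight.
            pending.appendleft(batch)
        else:
            log.append(batch)
    return log


events = ["A1", "A2", "A3"]
reordered = simulate_partition_log(events, fail_once={"A1"}, max_in_flight=2)
ordered = simulate_partition_log(events, fail_once={"A1"}, max_in_flight=1)
```

In this model, a window of 2 lets A2 overtake the retried A1, while a window of 1 keeps the producer's own order intact.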

The short answer is yes: each individual producer's events are guaranteed to be in order.

Messages in Kafka are appended to a topic partition in the order they are sent and the consumers read the messages in the same order they are stored in the topic partition.

So, assuming you are interested only in the messages from Producer A and filter out everything else, then in the given scenario you can expect to read events 1, 2 and 3 from Producer A in that order.
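As a small illustration of that filtering, assuming the event names from the question and a hypothetical helper events_from, dropping another producer's events from an interleaved partition log leaves the remaining events in the same relative order:

```python
def events_from(log, producer_id):
    """Keep only one producer's events, preserving partition (offset) order."""
    return [event for event in log if event.startswith(producer_id + "_")]


# An interleaved single-partition log, in offset order (illustrative data).
partition_log = [
    "ProducerA_event_1",
    "ProducerB_event_1",
    "ProducerB_event_2",
    "ProducerA_event_2",
    "ProducerA_event_3",
    "ProducerB_event_3",
]

producer_a_events = events_from(partition_log, "ProducerA")
```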

PS: I am curious about the motivation for using just one partition. Also, regarding your statement:

"So if ProducerA fired 3 events and then ProducerB fired 3 events, my understanding is that Kafka cannot guarantee the order across the producer's events like this:"

You are correct that the overall ordering across producers cannot be guaranteed, but ordering within a partition can be.

I hope this helps.

A producer's messages will be stored, per partition, in the order they are received by the broker. If you can guarantee message ordering on the producer side, then consumers can assume that ordering when polling. Retry logic, multiple KafkaProducer instances, and other asynchronous implementation details can complicate ordered message production. Often these can be mitigated by including a unique event identifier, an identifier of the producer, and a timestamp of sufficient granularity in either the key or the value of the message. Relying on ordering in an asynchronous framework often works in the best case, but there should be some way to compensate when events arrive out of order.
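One way to compensate on the consumer side is sketched below. It assumes, hypothetically, that each message carries a producer identifier and a per-producer sequence number starting at 1 (a sequence number rather than the timestamp mentioned above, to keep the sketch short); the class name PerProducerResequencer and its method are invented for this example and are not part of any Kafka client.

```python
class PerProducerResequencer:
    """Buffers out-of-order events and releases them in sequence order.

    Assumes every event has a (producer_id, seq) pair where seq counts
    up by one per producer, starting at first_seq.
    """

    def __init__(self, first_seq=1):
        self._first_seq = first_seq
        self._next_seq = {}   # producer_id -> next sequence number expected
        self._held = {}       # producer_id -> {seq: payload} waiting for a gap

    def accept(self, producer_id, seq, payload):
        """Return the events that can now be delivered in order."""
        expected = self._next_seq.setdefault(producer_id, self._first_seq)
        held = self._held.setdefault(producer_id, {})

        # Already delivered or already buffered: treat as a duplicate.
        if seq < expected or seq in held:
            return []

        held[seq] = payload
        ready = []
        # Release the longest run of consecutive events now available.
        while expected in held:
            ready.append((producer_id, expected, held.pop(expected)))
            expected += 1
        self._next_seq[producer_id] = expected
        return ready


resequencer = PerProducerResequencer()
first = resequencer.accept("ProducerA", 2, "a2")    # held back, event 1 missing
second = resequencer.accept("ProducerB", 1, "b1")   # in order, delivered
third = resequencer.accept("ProducerA", 1, "a1")    # fills the gap
fourth = resequencer.accept("ProducerA", 1, "dup")  # duplicate, dropped
```

A real consumer would also need a policy for gaps that never fill, such as a timeout or a bound on the buffer size; that is left out here.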

