
Kafka Streams: event-time skew when processing messages from different partitions

Let's consider a topic with multiple partitions and messages written in event-time order without any particular partitioning scheme. A Kafka Streams application does some transformations on these messages, then groups by some key, and then aggregates the messages in an event-time window with a given grace period.

Each task could process incoming messages at a different speed (e.g., because it runs on a server with different performance characteristics). This means that after the groupBy shuffle, event-time ordering will not be preserved between messages in the same partition of the internal topic when they originate from different tasks. After a while, this event-time skew could become larger than the grace period, which would lead to dropping messages originating from the lagging task.

Increasing the grace period doesn't seem like a valid option because it would delay emitting the final aggregation result. Apache Flink handles this by emitting the lowest watermark when partitions are merged.

Should this be a real concern, especially when processing large amounts of historical data, or am I missing something? Does Kafka Streams offer a solution to deal with this scenario?

UPDATE: My question is not about KStream-KStream joins but about a single KStream event-time aggregation preceded by a stream shuffle.

Consider this code snippet:

stream
  .mapValues(...)
  .groupBy(...)
  .windowedBy(TimeWindows.of(Duration.ofSeconds(60)).grace(Duration.ofSeconds(10)))
  .aggregate(...)

I assume the mapValues() operation could be slow for some tasks for whatever reason, and because of that, tasks process messages at a different pace. When a shuffle happens at the aggregate() operator, task 0 could have processed messages up to time t while task 1 is still at t-skew, but messages from both tasks end up interleaved in a single partition of the internal topic (corresponding to the grouping key).

My concern is that when the skew is large enough (more than 10 seconds in my example), messages from the lagging task 1 will be dropped.

Basically, a task/processor maintains a stream-time, which is defined as the highest timestamp of any record already polled. This stream-time is then used for different purposes in Kafka Streams (e.g., Punctuator, Windowed Aggregation, etc.).
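
As a minimal sketch of that definition (a toy model, not the actual Kafka Streams source), stream-time can be thought of as a value that only moves forward:

// Toy model of stream-time: it is monotonic and advances to the
// highest record timestamp observed so far.
public class StreamTimeTracker {
    private long streamTime = Long.MIN_VALUE;

    // Called for every polled record.
    void observe(long recordTimestampMs) {
        streamTime = Math.max(streamTime, recordTimestampMs);
    }

    long streamTime() {
        return streamTime;
    }
}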

[ Windowed Aggregation ]

As you mentioned, the stream-time is used to determine if a record should be accepted, i.e. record_accepted = end_window_time(current record) + grace_period > observed stream_time.
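
A minimal sketch of that acceptance rule (this just mirrors the formula above; it is not the actual Kafka Streams implementation):

public class GraceCheck {
    // record_accepted = end_window_time(record) + grace_period > stream_time
    static boolean isAccepted(long windowEndMs, long gracePeriodMs, long streamTimeMs) {
        return windowEndMs + gracePeriodMs > streamTimeMs;
    }

    public static void main(String[] args) {
        long windowEnd = 60_000L; // window [0s, 60s), as in the question's snippet
        long grace = 10_000L;     // 10s grace period
        System.out.println(isAccepted(windowEnd, grace, 65_000L)); // true: within grace
        System.out.println(isAccepted(windowEnd, grace, 75_000L)); // false: record dropped
    }
}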

As you described it, if several tasks run in parallel to shuffle messages based on a grouping key, and some tasks are slower than others (or some partitions are offline), this will create out-of-order messages. Unfortunately, I'm afraid that the only way to deal with that is to increase the grace_period.
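
Relative to the question's snippet, that would mean something like the following (the 120s value is purely illustrative):

import java.time.Duration;
import org.apache.kafka.streams.kstream.TimeWindows;

// Same 60s windows, but with a larger grace period so that records from a
// lagging task survive more event-time skew, at the cost of delaying the
// final (closed-window) result.
TimeWindows windows = TimeWindows.of(Duration.ofSeconds(60))
                                 .grace(Duration.ofSeconds(120));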

This is actually the eternal trade-off between Availability and Consistency.

[ Behaviour for KafkaStream and KafkaStream/KTable Join ]

When you are performing a join operation with Kafka Streams, an internal Task is assigned to the "same" partition over multiple co-partitioned topics. For example, Task 0 will be assigned to TopicA-Partition0 and TopicB-Partition0.

The fetched records are buffered per partition into internal queues that are managed by Tasks. So, each queue contains all the records of a single partition that are waiting to be processed.

Then, records are polled one by one from the queues and processed by the topology instance. The record that is returned for processing is the one from the non-empty queue with the lowest timestamp.
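
Here is a toy model of that selection rule (not the actual Kafka Streams internals), assuming one buffer per input partition:

import java.util.Comparator;
import java.util.List;
import java.util.Queue;

public class QueueSelection {
    record Rec(long timestamp, String value) {}

    // Among the non-empty per-partition buffers, poll the record whose head
    // has the lowest timestamp; returns null when all buffers are empty.
    static Rec pollNext(List<Queue<Rec>> partitionQueues) {
        return partitionQueues.stream()
            .filter(q -> !q.isEmpty())
            .min(Comparator.comparingLong((Queue<Rec> q) -> q.peek().timestamp()))
            .map(Queue::poll)
            .orElse(null);
    }
}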

In addition, if a queue is empty, the task may become idle for a period of time, so that no more records are polled from the queues. The maximum amount of time a Task will stay idle can be configured with the stream config max.task.idle.ms.

This mechanism allows synchronizing co-localized partitions. By default, max.task.idle.ms is set to 0. This means a Task will never wait for more data from a partition, which may lead to records being filtered because the stream-time will potentially increase more quickly.
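
A minimal sketch of raising that config (the 5s value is illustrative):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
// Default is 0: never wait on an empty input partition. A positive value
// lets the task idle briefly so slower partitions can catch up before
// stream-time advances.
props.put(StreamsConfig.MAX_TASK_IDLE_MS_CONFIG, 5_000L);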
