
Concurrency of Kafka streams topology with multiple output topics

Given a Kafka Streams topology that publishes messages to two different topics, are there any guarantees about the order in which the steps in these two branches are executed, or are the branches completely separate and executed in parallel?

Example

    KStream<..., ...> filteredStream = builder.stream("input-topic", ...).filter(...)...;

    filteredStream.mapValues(this::mapOne).to("output-topic-one", ...);
    filteredStream.flatMap(this::mapTwo).to("output-topic-two", ...);

In this example, will mapOne be executed and its result published to output-topic-one before mapTwo is even called or any messages are published to output-topic-two? In other words, is there a guarantee that mapOne will have finished before messages are published to output-topic-two?

Topology Visualization

When looking at the visualization of the topology description (see the bottom; made with https://zz85.github.io/kafka-streams-viz/) you can see the two branches. But you can also see the numbers in each bubble, which might indicate that there is an order of execution (1-4, then 5-6-7, then 8-9).

(Image: Kafka Streams topology)

Topology Description

Topologies:
   Sub-topology: 0
    Source: KSTREAM-SOURCE-0000000000 (topics: [input-topic])
      --> KSTREAM-FILTER-0000000001
    Processor: KSTREAM-FILTER-0000000001 (stores: [])
      --> KSTREAM-FILTER-0000000002
      <-- KSTREAM-SOURCE-0000000000
    Processor: KSTREAM-FILTER-0000000002 (stores: [])
      --> KSTREAM-MAP-0000000003
      <-- KSTREAM-FILTER-0000000001
    Processor: KSTREAM-MAP-0000000003 (stores: [])
      --> KSTREAM-FILTER-0000000004
      <-- KSTREAM-FILTER-0000000002
    Processor: KSTREAM-FILTER-0000000004 (stores: [])
      --> KSTREAM-MAPVALUES-0000000005, KSTREAM-FLATMAP-0000000008
      <-- KSTREAM-MAP-0000000003
    Processor: KSTREAM-MAPVALUES-0000000005 (stores: [])
      --> KSTREAM-FILTER-0000000006
      <-- KSTREAM-FILTER-0000000004
    Processor: KSTREAM-FILTER-0000000006 (stores: [])
      --> KSTREAM-SINK-0000000007
      <-- KSTREAM-MAPVALUES-0000000005
    Processor: KSTREAM-FLATMAP-0000000008 (stores: [])
      --> KSTREAM-SINK-0000000009
      <-- KSTREAM-FILTER-0000000004
    Sink: KSTREAM-SINK-0000000007 (topic: output-topic-one)
      <-- KSTREAM-FILTER-0000000006
    Sink: KSTREAM-SINK-0000000009 (topic: output-topic-two)
      <-- KSTREAM-FLATMAP-0000000008

Kafka Streams always guarantees the topology order. A message is always passed through a topology, and that topology consists of nodes and edges. Those nodes and edges are added to the topology as you define it in the application.

In your case, each record of the filtered stream goes through the mapValues branch of the topology until that path ends (in your case at the sink to output-topic-one).

Then it continues with the flatMap branch until it reaches the sink to output-topic-two.

The order of execution matches the node IDs:

0000000004 -> 0000000005 -> 0000000006 -> 0000000007

0000000004 -> 0000000008 -> 0000000009
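
To make the traversal order above concrete, here is a minimal plain-Java sketch of depth-first forwarding through the fork at node 4. It is a simplified model, not Kafka Streams' own code: the `DepthFirstForwarding` and `Node` classes and the shortened node names are hypothetical, it assumes a single thread handling one record at a time, and it ignores that flatMap may emit several records.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of depth-first record forwarding; NOT Kafka Streams' actual classes.
class DepthFirstForwarding {

    // Hypothetical stand-in for a processor node in a sub-topology.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        // Adds a child edge and returns the child so that paths can be chained.
        Node to(Node child) {
            children.add(child);
            return child;
        }

        // Handle the record at this node, then forward it to each child in the
        // order the children were added. The whole downstream path of one child
        // completes before the next child is visited.
        void process(String record, List<String> trace) {
            trace.add(name);
            for (Node child : children) {
                child.process(record, trace);
            }
        }
    }

    // Builds the fork from the question (nodes 4 to 9) and returns the order
    // in which a single record visits the nodes.
    static List<String> visitOrder(String record) {
        Node filter4 = new Node("FILTER-4");
        filter4.to(new Node("MAPVALUES-5")).to(new Node("FILTER-6")).to(new Node("SINK-7"));
        filter4.to(new Node("FLATMAP-8")).to(new Node("SINK-9"));

        List<String> trace = new ArrayList<>();
        filter4.process(record, trace);
        return trace;
    }

    public static void main(String[] args) {
        visitOrder("some-record").forEach(System.out::println);
    }
}
```

In this model the record visits FILTER-4, MAPVALUES-5, FILTER-6 and SINK-7 before FLATMAP-8 and SINK-9. Note that reaching a sink in the model only means the record was handed over for sending; when the message becomes visible in each output topic is a separate matter that this sketch does not cover.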

For more information, go through the internal topology builder in the Kafka source code.

And refer to this.
