Concurrency of Kafka Streams topology with multiple output topics
Given a Kafka Streams topology which publishes messages to two different topics, are there any guarantees about the order in which the steps in these two branches are executed, or are the branches completely separate and executed in parallel?
KStream<..., ...> filteredStream = builder.stream("input-topic", ...).filter(...)...;
filteredStream.mapValues(this::mapOne).to("output-topic-one", ...);
filteredStream.flatMap(this::mapTwo).to("output-topic-two", ...);
In this example, will mapOne be executed and the publishing to output-topic-one be done before mapTwo is even called or messages are published to output-topic-two? In other words, is there a guarantee that mapOne will have finished before messages are published to output-topic-two?
When looking at the visualization of the topology description (see at the bottom; made with https://zz85.github.io/kafka-streams-viz/) you can see the two branches. But you can also see numbers in each bubble, which might indicate that there is an order of execution (1-4, then 5-6-7, then 8-9).
Topologies:
   Sub-topology: 0
    Source: KSTREAM-SOURCE-0000000000 (topics: [input-topic])
      --> KSTREAM-FILTER-0000000001
    Processor: KSTREAM-FILTER-0000000001 (stores: [])
      --> KSTREAM-FILTER-0000000002
      <-- KSTREAM-SOURCE-0000000000
    Processor: KSTREAM-FILTER-0000000002 (stores: [])
      --> KSTREAM-MAP-0000000003
      <-- KSTREAM-FILTER-0000000001
    Processor: KSTREAM-MAP-0000000003 (stores: [])
      --> KSTREAM-FILTER-0000000004
      <-- KSTREAM-FILTER-0000000002
    Processor: KSTREAM-FILTER-0000000004 (stores: [])
      --> KSTREAM-MAPVALUES-0000000005, KSTREAM-FLATMAP-0000000008
      <-- KSTREAM-MAP-0000000003
    Processor: KSTREAM-MAPVALUES-0000000005 (stores: [])
      --> KSTREAM-FILTER-0000000006
      <-- KSTREAM-FILTER-0000000004
    Processor: KSTREAM-FILTER-0000000006 (stores: [])
      --> KSTREAM-SINK-0000000007
      <-- KSTREAM-MAPVALUES-0000000005
    Processor: KSTREAM-FLATMAP-0000000008 (stores: [])
      --> KSTREAM-SINK-0000000009
      <-- KSTREAM-FILTER-0000000004
    Sink: KSTREAM-SINK-0000000007 (topic: output-topic-one)
      <-- KSTREAM-FILTER-0000000006
    Sink: KSTREAM-SINK-0000000009 (topic: output-topic-two)
      <-- KSTREAM-FLATMAP-0000000008
Kafka Streams always guarantees the topology order. A message is always passed through a topology, and that topology consists of nodes and edges. Those nodes and edges are added to the topology in the order in which you define them in the application.
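To illustrate how definition order can end up in the node names, here is a toy sketch in plain Java. It is not the real builder: the class NamingDemo, its add method and the running counter are hypothetical, and it only assumes that each generated name gets the next value of a counter at the moment the operator is defined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy model of index-based node naming at definition time; NOT the real Kafka Streams builder. */
public class NamingDemo {
    private int index = 0;                         // hypothetical running counter
    final List<String> edges = new ArrayList<>();  // "parent --> child" lines

    /** Registers a node under the given parent (null for a source) and returns its generated name. */
    String add(String type, String parent) {
        String name = String.format(Locale.ROOT, "KSTREAM-%s-%010d", type, index++);
        if (parent != null) {
            edges.add(parent + " --> " + name);
        }
        return name;
    }

    /** Defines a source, a filter and two branches, in the same order as the DSL code would. */
    static List<String> build() {
        NamingDemo b = new NamingDemo();
        String source = b.add("SOURCE", null);
        String filter = b.add("FILTER", source);
        // The first branch is defined completely before the second one starts.
        String mapValues = b.add("MAPVALUES", filter);
        b.add("SINK", mapValues);
        String flatMap = b.add("FLATMAP", filter);
        b.add("SINK", flatMap);
        return b.edges;
    }

    public static void main(String[] args) {
        build().forEach(System.out::println);
    }
}
```

In this model the second branch gets higher numbers simply because it is defined later, so the numbers by themselves describe definition order.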
In your case, each record of the filtered stream goes through the map values branch of the topology until that path ends (in your case sink -> topic one). Then it continues with the flat map branch until the sink to topic two.
The processor IDs reflect that order:
0000000004 -> 0000000005 -> 0000000006 -> 0000000007
0000000004 -> 0000000008 -> 0000000009
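The traversal described above can be sketched as a toy model in plain Java. This is not Kafka Streams code: the Node class and the shortened node names are hypothetical, and the sketch only assumes that a node forwards a record to its children one after another, finishing each child's subtree before moving on to the next.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of depth-first record forwarding; NOT Kafka Streams code. */
public class DepthFirstDemo {

    /** A hypothetical processor node that records its visit and forwards to its children in order. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        /** Attaches a child and returns it, so that a branch can be chained. */
        Node to(Node child) {
            children.add(child);
            return child;
        }

        void process(List<String> trace) {
            trace.add(name);
            // Each child subtree is finished before the next child starts.
            for (Node child : children) {
                child.process(trace);
            }
        }
    }

    /** Builds the tail of the topology (from node 4 on) and pushes one record through it. */
    static List<String> run() {
        Node filter = new Node("FILTER-4");
        // Branch one is attached first, so it is visited first.
        filter.to(new Node("MAPVALUES-5")).to(new Node("FILTER-6")).to(new Node("SINK-7"));
        // Branch two is attached second.
        filter.to(new Node("FLATMAP-8")).to(new Node("SINK-9"));

        List<String> trace = new ArrayList<>();
        filter.process(trace);
        return trace;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Note that this models only the order of the processing calls for one record. In Kafka Streams the sink hands the record to a producer that sends asynchronously, so the sketch says nothing about when the record actually becomes visible on either output topic.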
For more information, have a look at the internal topology builder in the Kafka source code.