How to correlate Kafka events from multiple topics?

I have a Spring Boot application that consumes from topic A and from topic B. Events from topic A are used to create entries in the database, and events from topic B use those entries to perform a task. As one might expect, events from topic B rely on events from topic A arriving first and creating the entries. So far I have simply consumed the events from topic A first and then consumed the events from topic B once all the events from topic A had been processed. So my events arrive in the following order:

eventA1
eventA2
eventA3
eventB1 (would look for entry created by A1)
eventB2 (would look for entry created by A2)
eventB3 (would look for entry created by A3)

However, say I cannot rely on topic B sending its events after topic A. In that case events from topic B may arrive first (and cannot do anything, since the matching events from topic A come later), like so:

eventB2 (would look for entry created by A2 but it has not been received yet)
eventB3 (would look for entry created by A3 but it has not been received yet)
eventA1
eventA2
eventB1 (would look for entry created by A1)
eventA3

So far my solution is to store events from topic B somewhere and use a scheduler to periodically check whether the matching events from topic A have arrived yet, so that the events from topic B can do their work. That seems far from ideal (I cannot put it into words, but my intuition tells me I could run into a number of problems with this approach) and it adds a fair share of complexity.
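To make the store-and-recheck approach concrete, here is a minimal in-memory sketch. It is an illustration only: `PendingBBuffer`, `onA`, `onB`, and `retryParked` are hypothetical names, a `Set` stands in for the database, and events are correlated by a shared string key (the question only says such a field exists). In a real application the parked events would need durable storage, and `retryParked` would be driven by the scheduler.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** In-memory stand-in for "park B until A arrives" (all names are hypothetical). */
class PendingBBuffer {

    /** A topic-B event together with the key it shares with its topic-A event. */
    private static final class ParkedB {
        final String key;
        final String name;

        ParkedB(String key, String name) {
            this.key = key;
            this.name = name;
        }
    }

    private final Set<String> entries = new HashSet<>();      // stands in for DB rows created by A
    private final List<ParkedB> parked = new ArrayList<>();   // B events that arrived too early
    private final List<String> processed = new ArrayList<>(); // B events whose task has run, in order

    /** Topic A handler: create the entry that the matching B event needs. */
    void onA(String key) {
        entries.add(key);
    }

    /** Topic B handler: run the task if the entry exists, otherwise park the event. */
    void onB(String key, String name) {
        if (entries.contains(key)) {
            processed.add(name);
        } else {
            parked.add(new ParkedB(key, name));
        }
    }

    /** What the scheduler would call periodically: retry every parked B event. */
    void retryParked() {
        Iterator<ParkedB> it = parked.iterator();
        while (it.hasNext()) {
            ParkedB b = it.next();
            if (entries.contains(b.key)) {
                processed.add(b.name);
                it.remove();
            }
        }
    }

    List<String> processed() {
        return List.copyOf(processed);
    }

    int parkedCount() {
        return parked.size();
    }

    public static void main(String[] args) {
        PendingBBuffer buffer = new PendingBBuffer();
        // Same arrival order as the out-of-order sequence above.
        buffer.onB("2", "eventB2");
        buffer.onB("3", "eventB3");
        buffer.onA("1");
        buffer.onA("2");
        buffer.onB("1", "eventB1");
        buffer.onA("3");
        buffer.retryParked(); // one scheduler tick after everything has arrived
        System.out.println(buffer.processed());
    }
}
```

Note that in this sketch eventB1 runs immediately because A1 is already present, while eventB2 and eventB3 only run on the next sweep, so the B events are not processed in their original order.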

Something similar has been asked here and it seems that the answer is: just don't. Combine the messages into one and don't mess around with joining streams. While I wish I could do that, unfortunately I can't.

Another interesting idea mentioned here is to feed topic A into topic B and then consume the single topic containing both pieces. Unfortunately, that is again something I cannot do.

So my question is as follows: how can I make it so that if an event from topic B arrives and the relevant event from topic A has not, it waits until that event from topic A is received? Is this even possible without temporarily storing events from topic B somewhere and creating a scheduler? Is it possible to somehow tell Kafka topics to wait for each other (events from both topics have a field that can be used to figure out which two events need each other)?

You can use the non-blocking retries feature that Spring for Apache Kafka provides:

https://docs.spring.io/spring-kafka/docs/current/reference/html/#retry-topic

If a B arrives too soon, throw an exception: the record will move to the next topic in the retry sequence and won't be delivered again until the delay expires.

It is similar to what you are doing now, but it is all taken care of by the framework.
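A rough sketch of what such a listener might look like, assuming Spring for Apache Kafka 2.7 to 3.x, where `@Backoff` comes from spring-retry (newer versions may differ). The topic name `topic-b`, the group id, the retry settings, and the `EntryLookup` interface are hypothetical, and the record key is assumed to carry the field that correlates A and B events. Depending on the version, extra setup such as `@EnableKafkaRetryTopic` and a `KafkaTemplate` bean may be needed; the linked reference is the authority here, not this sketch.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.DltHandler;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.RetryableTopic;
import org.springframework.retry.annotation.Backoff;
import org.springframework.stereotype.Component;

@Component
public class TopicBListener {

    /** Hypothetical lookup over the entries created from topic A events. */
    public interface EntryLookup {
        boolean exists(String correlationId);
    }

    private final EntryLookup entries;

    public TopicBListener(EntryLookup entries) {
        this.entries = entries;
    }

    // Up to 5 deliveries in total, waiting 2s, 4s, 8s, 16s between them (illustrative values).
    @RetryableTopic(attempts = "5", backoff = @Backoff(delay = 2000, multiplier = 2.0))
    @KafkaListener(topics = "topic-b", groupId = "correlator")
    public void onB(ConsumerRecord<String, String> record) {
        // Assumption: the record key is the field shared by matching A and B events.
        String correlationId = record.key();
        if (!entries.exists(correlationId)) {
            // Throwing hands the record to the next retry topic; later records are not blocked.
            throw new IllegalStateException("No entry from topic A yet for " + correlationId);
        }
        // ... perform the task using the entry created from topic A
    }

    // Called once the retries are exhausted and the matching A event never showed up.
    @DltHandler
    public void onRetriesExhausted(ConsumerRecord<String, String> record) {
        // ... log, alert, or store the orphaned B event for manual handling
    }
}
```

The retry budget should be chosen to cover the longest gap you expect between a B event and its A event; anything that outlives it ends up in the dead-letter handler.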
