How to join multiple Kafka topics?

So I have...

  • 1st topic has general application logs (log4j). It stores things like HTTP API requests/responses, warnings, exceptions, etc. There can be multiple logs associated with one logical business request. (These logs happen within seconds of each other.)
  • 2nd topic contains commands from the above business request which other services take action on. (The commands also happen within seconds of each other, but maybe a couple of minutes after the original request.)
  • 3rd topic contains events generated from the actions of those other services. (Most events complete within seconds, but some can take up to 3-5 days to be received.)

So a single logical business request can have multiple logs, commands and events associated with it by a uuid which the microservices pass to each other.

So what are some of the technologies/patterns that can be used to read the 3 topics, join them all together as a single JSON document, and then dump them to, let's say, Elasticsearch?

Streaming?
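To make the target concrete, here is a minimal sketch of the desired result: one document per uuid holding all of its logs, commands and events. It is plain in-memory Python, not Kafka code, and the field names (`uuid`, `message`, `command`, `event`) and sample values are assumptions made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical sample records; the field names are assumptions.
logs = [
    {"uuid": "req-1", "message": "POST /orders received"},
    {"uuid": "req-1", "message": "WARN slow downstream call"},
]
commands = [{"uuid": "req-1", "command": "ReserveStock"}]
events = [{"uuid": "req-1", "event": "StockReserved"}]


def merge_by_uuid(logs, commands, events):
    """Group records from the three sources into one document per uuid."""
    docs = defaultdict(lambda: {"logs": [], "commands": [], "events": []})
    sources = (("logs", logs), ("commands", commands), ("events", events))
    for section, records in sources:
        for record in records:
            docs[record["uuid"]][section].append(record)
    # Attach the uuid itself so each document is self-describing.
    return {uuid: {"uuid": uuid, **parts} for uuid, parts in docs.items()}


merged = merge_by_uuid(logs, commands, events)
print(json.dumps(merged["req-1"], indent=2))
```

Each value in `merged` is the kind of single JSON document that would then be indexed, keyed by the uuid.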

You can use Kafka Streams, or KSQL, to achieve this. Which one you choose depends on your preference/experience with Java, and also on the specifics of the joins you want to do.

KSQL is the SQL streaming engine for Apache Kafka, and with SQL alone you can declare stream processing applications against Kafka topics. You can filter, enrich, and aggregate topics. Currently only stream-table joins are supported. You can see an example in this article here.

The Kafka Streams API is part of Apache Kafka, and is a Java library that you can use to do stream processing of data in Apache Kafka. It is actually what KSQL is built on, and it supports greater flexibility of processing, including stream-stream joins.
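As a rough illustration of what a windowed stream-stream inner join computes, the sketch below pairs records from two inputs when their keys match and their timestamps are within a given window. It is plain Python over in-memory lists, not Kafka Streams code; the function name `windowed_join` and the sample records are hypothetical, and a real stream processor does this incrementally rather than in one batch.

```python
from collections import defaultdict


def windowed_join(left, right, window_seconds):
    """Inner-join two lists of (key, timestamp, value) records.

    A pair is emitted when the keys match and the timestamps differ
    by at most window_seconds.
    """
    right_by_key = defaultdict(list)
    for key, ts, value in right:
        right_by_key[key].append((ts, value))

    joined = []
    for key, ts, value in left:
        for other_ts, other_value in right_by_key[key]:
            if abs(ts - other_ts) <= window_seconds:
                joined.append((key, value, other_value))
    return joined


# Hypothetical records: (uuid, epoch seconds, payload).
four_days = 4 * 24 * 3600
commands = [("req-1", 100, "ReserveStock"), ("req-2", 200, "ShipOrder")]
events = [
    ("req-1", 130, "StockReserved"),
    ("req-2", 200 + four_days, "OrderShipped"),  # arrives four days later
]

five_minutes = windowed_join(commands, events, window_seconds=300)
five_days = windowed_join(commands, events, window_seconds=5 * 24 * 3600)
```

With a five-minute window only `req-1` is joined; the late `req-2` event is matched only once the window is widened to cover days. For the 3-5 day events in the question, that generally means the join has to retain state for that long.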

You can use KSQL to join the streams.

  1. There are 2 constructs in KSQL: Table and Stream.
  2. Currently, joins are supported between a Stream and a Table, so you need to identify which construct is a good fit for which topic.
  3. You don't need windowing for stream-table joins.
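The points above can be sketched in plain Python to show why a stream-table join needs no window: the table side only keeps the latest value per key, and each stream record is looked up against whatever the table holds at that moment. This is an in-memory illustration, not KSQL; the function name `stream_table_join` and the sample records are hypothetical.

```python
def stream_table_join(records):
    """Process a time-ordered list of (source, key, value) records.

    source is "table" (an update to the table side) or "stream"
    (a record on the stream side that triggers a lookup).
    """
    table = {}
    output = []
    for source, key, value in records:
        if source == "table":
            table[key] = value  # a newer value replaces the older one
        elif key in table:
            output.append((key, value, table[key]))
        # Stream records with no table entry are dropped (inner join).
    return output


records = [
    ("stream", "req-1", "StockReserved"),   # no table entry yet: dropped
    ("table", "req-1", "ReserveStock"),
    ("stream", "req-1", "StockReserved"),   # joins with ReserveStock
    ("table", "req-1", "CancelOrder"),      # table update emits nothing
    ("stream", "req-1", "OrderCancelled"),  # joins with CancelOrder
]
result = stream_table_join(records)
```

Because the table keeps one value per key, a topic with several records per uuid (such as the logs in the question) would first need to be aggregated per uuid before it is a good fit for the table side.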

Benefits of using KSQL.

  1. KSQL is easy to set up.
  2. KSQL is a SQL language, which helps you query your data quickly.

Drawbacks.

  1. It's not production ready yet, but a release is coming up in April 2018.
  2. It's a little buggy right now, but it will certainly improve in a few months.

Please have a look.

https://github.com/confluentinc/ksql
