Kafka Streams use case

Original post: 2017-04-19 19:49:00 · Tags: apache-kafka, apache-kafka-streams

I am building a simple application that does the following, in order:

1) Reads messages from a remote IBM MQ (the legacy system only works with IBM MQ).

2) Writes these messages to a Kafka topic.

3) Reads these messages from the same Kafka topic and calls a REST API.

4) There could be other consumers reading from this topic in the future.

I learned that Kafka has a new Streams API that is supposed to be better than the Kafka consumer in terms of speed, simplicity, and so on. Can someone please tell me whether the Streams API is a good fit for my use case and, if so, at what point in my process I can plug it in?

3 answers

1) Reads messages from a remote IBM MQ (legacy system only works with IBM MQ)

2) Writes these messages to Kafka Topic

I'd use Kafka's Connect API for (1) and (2).
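As a rough illustration of what "using Connect" for steps (1) and (2) can look like, here is a sketch of a source connector configuration. It assumes IBM's open-source kafka-connect-mq-source connector is installed on the Connect worker, which the answer does not specify. The queue manager, host, channel, queue, and topic values are all placeholders, and the property names should be checked against the README of the connector version you use.

```properties
# Sketch only: assumes the IBM kafka-connect-mq-source connector is on the worker's plugin path.
name=mq-source
connector.class=com.ibm.eventstreams.connect.mqsource.MQSourceConnector
tasks.max=1

# Placeholder connection details for the remote queue manager (step 1).
mq.queue.manager=QM1
mq.connection.name.list=mq-host.example.com(1414)
mq.channel.name=DEV.APP.SVRCONN
mq.queue=LEGACY.OUT
mq.record.builder=com.ibm.eventstreams.connect.mqsource.builders.DefaultRecordBuilder

# Placeholder Kafka topic that receives the MQ messages (step 2).
topic=legacy-messages
```

With a configuration along these lines, no custom producer code is needed for the MQ-to-Kafka leg; the Connect worker runs the connector and writes to the topic.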

3) Reads these messages from the same Kafka Topic and calls a REST API.

You can use either the Streams API or Kafka's lower-level Consumer API, depending on which you prefer.

4) There could be other consumers reading from this topic in future.

This works out of the box: once data is stored in a Kafka topic as in step 2, many different applications and "consumers" can read this data independently.
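To make the "read independently" point concrete, here is a toy in-memory model of a single-partition topic in which each consumer group keeps its own offset. It is not the Kafka client API. The class name `TopicLogSketch`, its methods, and the group names are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a single-partition topic; NOT the Kafka client API. */
public class TopicLogSketch {
    // Records are only appended; reading never removes them.
    private final List<String> records = new ArrayList<>();
    // Each consumer group tracks its own read position (offset) in the shared log.
    private final Map<String, Integer> offsets = new HashMap<>();

    /** Appends a record to the end of the log (the role of step 2). */
    public void append(String record) {
        records.add(record);
    }

    /** Returns up to max unread records for the group and advances only that group's offset. */
    public List<String> poll(String groupId, int max) {
        int from = offsets.getOrDefault(groupId, 0);
        int to = Math.min(records.size(), from + max);
        offsets.put(groupId, to);
        return new ArrayList<>(records.subList(from, to));
    }

    public static void main(String[] args) {
        TopicLogSketch topic = new TopicLogSketch();
        topic.append("order-1");
        topic.append("order-2");
        topic.append("order-3");

        // The REST-calling application (step 3) reads everything.
        System.out.println("rest-caller: " + topic.poll("rest-caller", 10));
        // A consumer added later (step 4) still sees the full log from the start.
        System.out.println("audit: " + topic.poll("audit", 2));
        System.out.println("audit: " + topic.poll("audit", 2));
    }
}
```

The design point is that consumption is just a per-group cursor over shared, retained data, which is why adding a second application later does not disturb the first.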

It is true that the Kafka Streams API offers a simpler way to consume records than the Kafka Consumer API (e.g. you don't need to poll or manage a thread and a loop), but it also comes at a cost (e.g. a local data store, if you do stateful processing).

I would say that if you need to consume records one by one and call a REST API, use the Consumer API; if you need stateful processing, to query the topic state, etc., use the Streams API.
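The distinction between the two cases can be sketched in plain Java, without any Kafka dependency. The endpoint URL, the class name `ProcessingStyles`, and the helper names are assumptions made for illustration. The polling loop that would feed these handlers is deliberately left out, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.HashMap;
import java.util.Map;

public class ProcessingStyles {
    // Hypothetical endpoint; the question does not name the REST API.
    public static final URI ENDPOINT = URI.create("http://localhost:8080/messages");

    /** Stateless: the request depends only on the current record, so nothing is kept between calls. */
    public static HttpRequest toRequest(String value) {
        return HttpRequest.newBuilder(ENDPOINT)
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(value))
                .build();
    }

    /** Stateful: a running count per key has to be remembered across records. */
    public static final class CountPerKey {
        private final Map<String, Long> counts = new HashMap<>();

        /** Adds one occurrence of the key and returns the updated count. */
        public long increment(String key) {
            return counts.merge(key, 1L, Long::sum);
        }
    }

    public static void main(String[] args) {
        // Stateless case: each record maps to one self-contained request.
        HttpRequest request = toRequest("hello");
        System.out.println(request.method() + " " + request.uri());

        // Stateful case: the result for a record depends on earlier records.
        CountPerKey counter = new CountPerKey();
        counter.increment("customer-42");
        System.out.println(counter.increment("customer-42"));
    }
}
```

The `counts` map here plays the role of the state that a stateful Streams application keeps in its local store. The stateless handler needs nothing of the kind, which is the case described in the question.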

For more info, take a look at this blog post: https://balamaci.ro/kafka-streams-for-stream-processing/

It looks like you are not doing any processing or transformation, either after you consume your message from IBM MQ or after it reaches your Kafka topic.

First, moving data from IBM MQ to your Kafka topic is a kind of pipeline; second, you are just calling the REST API (I assume without any processing).

Given these facts, a simple consumer seems to be a good fit.

Let's not use a technology only because it's there :)