
Is it good practice to make synchronous database queries or RESTful calls in Kafka Streams jobs?

I use Kafka Streams to process real-time data. In the Kafka Streams tasks, I need to query data from MySQL and call another RESTful service.

All of these operations are synchronous.

I'm afraid the synchronous calls will reduce the processing throughput of the stream tasks.

Is this good practice? If not, is there a better way to do it?

A better way to do it would be to stream your MySQL table(s) into Kafka and access the data there. This has the advantage of decoupling your streams app from the MySQL database. If you moved away from MySQL in the future, your streams app would be unaffected, so long as the data were still written to the Kafka topic from wherever it lived afterwards. If it's just configuration you're storing in MySQL, you could even adopt the pattern some people use of making Kafka the primary store for the data (using log compaction to retain it indefinitely).
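To illustrate the idea, here is a minimal plain-Java sketch of what "access the data there" amounts to: the table's changelog is replayed into a local map, and each incoming event is enriched by a local lookup instead of a blocking MySQL query. It deliberately does not use the Kafka Streams API. The names `LocalLookupSketch`, `Change`, `Order`, `materialize`, and `enrich` are hypothetical and exist only for this example. In a real Kafka Streams app, the equivalent would roughly be reading the topic as a `KTable` or `GlobalKTable` and joining your event stream against it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the pattern; this is NOT the Kafka Streams API.
final class LocalLookupSketch {

    // One changelog record from the table topic.
    // A null value is treated as a tombstone, i.e. a delete for that key.
    static final class Change {
        final String key;
        final String value;
        Change(String key, String value) { this.key = key; this.value = value; }
    }

    // An incoming stream event that needs data from the table.
    static final class Order {
        final String userId;
        final long amountCents;
        Order(String userId, long amountCents) {
            this.userId = userId;
            this.amountCents = amountCents;
        }
    }

    // Replay the changelog into a local table, keeping only the latest
    // value per key and dropping keys that were deleted.
    static Map<String, String> materialize(List<Change> changelog) {
        Map<String, String> table = new HashMap<>();
        for (Change c : changelog) {
            if (c.value == null) {
                table.remove(c.key);
            } else {
                table.put(c.key, c.value);
            }
        }
        return table;
    }

    // Enrich each order from the local table. No network call is made per
    // event. Orders with no matching user are skipped (inner-join behaviour).
    static List<String> enrich(List<Order> orders, Map<String, String> users) {
        List<String> out = new ArrayList<>();
        for (Order o : orders) {
            String name = users.get(o.userId);
            if (name != null) {
                out.add(name + ":" + o.amountCents);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> users = materialize(Arrays.asList(
                new Change("u1", "alice"),
                new Change("u2", "bob"),
                new Change("u1", "alicia")));   // later update for u1
        List<Order> orders = Arrays.asList(
                new Order("u1", 1000L),
                new Order("u3", 250L));         // u3 is not in the table
        System.out.println(enrich(orders, users));
    }
}
```

The map built by `materialize` holds only the latest value per key, which is the same guarantee a log-compacted topic (`cleanup.policy=compact`) gives: at least the most recent record for each key is retained. That is why a compacted topic can serve as the long-lived store for configuration-style data.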


 