
Spark structured streaming from JDBC source

Can someone let me know whether it is possible to do Spark Structured Streaming from a JDBC source, e.g. a SQL database or any other RDBMS?

I have looked at a few similar questions on SO, e.g.:

Spark streaming jdbc read the stream as and when data comes - Data source jdbc does not support streamed reading

jdbc source and spark structured streaming

However, I would like to know whether it is officially supported in Apache Spark.

Any sample code would be helpful.

Thanks

No, there is no such built-in support in Spark Structured Streaming. The main reason is that most databases don't provide a unified interface for obtaining changes.

It's possible to get changes from some databases using archive logs, write-ahead logs, etc., but that is database-specific. For many databases the popular choice is Debezium, which can read such logs and push the list of changes into Kafka, or something similar, from which Spark can consume them.
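The Kafka side of such a pipeline could look roughly like the PySpark sketch below. It is only an illustration under stated assumptions: the broker address, topic name, and helper names are hypothetical, the `spark-sql-kafka-0-10` connector must be on the classpath, and the Debezium JSON converter is assumed to run with schemas disabled so that `op`, `ts_ms`, `before` and `after` sit at the top level of each message.

```python
# Column expressions that pull the Debezium envelope fields out of the
# Kafka message value (assumes schema-less JSON change events).
CHANGE_EVENT_COLUMNS = [
    "CAST(key AS STRING) AS key",
    "get_json_object(CAST(value AS STRING), '$.op') AS op",
    "CAST(get_json_object(CAST(value AS STRING), '$.ts_ms') AS BIGINT) AS ts_ms",
    "get_json_object(CAST(value AS STRING), '$.before') AS before_json",
    "get_json_object(CAST(value AS STRING), '$.after') AS after_json",
]


def kafka_source_options(bootstrap_servers, topic):
    """Options for Spark's built-in Kafka source (values are placeholders)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "earliest",
    }


def read_change_stream(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame with one row per change event."""
    return (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
        .selectExpr(*CHANGE_EVENT_COLUMNS)
    )


# Hypothetical usage, given an existing SparkSession named `spark`:
# changes = read_change_stream(spark, "localhost:9092", "dbserver1.inventory.customers")
```

The row images are left as JSON strings here; in practice they would be parsed with `from_json` against the schema of the source table.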

I am on a project now architecting this using Shareplex CDC from Oracle, writing to Kafka, and then using Spark Structured Streaming with the Kafka integration and MERGE on Delta format on HDFS.
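The MERGE step of a pipeline like this is commonly done inside `foreachBatch`. The following is a minimal sketch, not the actual project code: it assumes the `delta-spark` package is installed, that the target Delta table already exists at a hypothetical path, and that each micro-batch has already been flattened to the table's columns plus an `op` column (`'d'` for deletes) and a `ts_ms` timestamp. The helper names are made up for illustration.

```python
def merge_condition(key_columns, target="t", source="s"):
    """Build the join predicate for MERGE, e.g. 't.id = s.id'."""
    return " AND ".join(f"{target}.{c} = {source}.{c}" for c in key_columns)


def make_upsert_fn(spark, table_path, key_columns):
    """Return a foreachBatch callback that merges one micro-batch into Delta."""
    # Imported lazily so this module loads without Spark/Delta installed.
    from delta.tables import DeltaTable
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    def upsert(batch_df, batch_id):
        # Keep only the newest change per key within this micro-batch,
        # so MERGE sees at most one source row per target row.
        newest = Window.partitionBy(*key_columns).orderBy(F.col("ts_ms").desc())
        latest = (
            batch_df.withColumn("_rn", F.row_number().over(newest))
            .filter("_rn = 1")
            .drop("_rn")
        )
        (
            DeltaTable.forPath(spark, table_path).alias("t")
            .merge(latest.alias("s"), merge_condition(key_columns))
            .whenMatchedDelete(condition="s.op = 'd'")
            .whenMatchedUpdateAll(condition="s.op <> 'd'")
            .whenNotMatchedInsertAll(condition="s.op <> 'd'")
            .execute()
        )

    return upsert


# Hypothetical usage with a streaming DataFrame `changes`:
# (changes.writeStream
#     .foreachBatch(make_upsert_fn(spark, "/data/delta/customers", ["id"]))
#     .option("checkpointLocation", "/data/checkpoints/customers")
#     .start())
```

Deduplicating per key before the merge matters because Delta rejects a MERGE in which several source rows match the same target row.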

I.e. that is the way to do it if you are not using Debezium. You can use change logs for base tables or materialized views to feed CDC.

So direct JDBC is not possible.


 