Stream reading from a database using Spark Streaming

I want to use Spark Streaming to read data from an RDBMS such as MySQL.

But I don't know how to do this using JavaStreamingContext:

 JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.milliseconds(500));
DataFrame df = jssc. ??

I searched the internet but didn't find anything.

Thank you in advance.

You cannot do it that way without installing third-party software.
What you CAN do is create a custom receiver that does what you want, combining the Spark SQL and Spark Streaming packages.
Implement a class that extends Receiver, and inside it handle the connections and queries needed to pull the data from the DB.
I am at work now, so I'll give you links instead of writing the code, sorry:
http://spark.apache.org/docs/latest/streaming-custom-receivers.html
https://medium.com/@anicolaspp/spark-custom-streaming-sources-e7d52da72e80
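
As a rough illustration of the query logic such a receiver might run, here is a minimal sketch in plain Java. It assumes the table has a monotonically increasing id column, which the question does not state. The names IncrementalPoller, Row, fetchAfter and sink are hypothetical: fetchAfter stands in for a JDBC query along the lines of SELECT ... WHERE id > ? ORDER BY id, and sink stands in for the store(...) call that a Receiver subclass uses to hand data to Spark.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongFunction;

/** Hypothetical helper: one polling step of a database-backed receiver. */
class IncrementalPoller {

    /** A row reduced to the two fields this sketch needs. */
    static final class Row {
        final long id;
        final String payload;

        Row(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Assumption: returns all rows whose id is greater than the argument.
    private final LongFunction<List<Row>> fetchAfter;
    // Assumption: forwards one record downstream, like Receiver.store(...).
    private final Consumer<String> sink;
    // Highest id forwarded so far; rows at or below it are never re-read.
    private long lastSeenId;

    IncrementalPoller(LongFunction<List<Row>> fetchAfter,
                      Consumer<String> sink,
                      long startAfterId) {
        this.fetchAfter = fetchAfter;
        this.sink = sink;
        this.lastSeenId = startAfterId;
    }

    /** Fetches rows newer than the last seen id, forwards them, and returns the count. */
    int pollOnce() {
        List<Row> rows = fetchAfter.apply(lastSeenId);
        for (Row row : rows) {
            sink.accept(row.payload);
            lastSeenId = Math.max(lastSeenId, row.id);
        }
        return rows.size();
    }

    long lastSeenId() {
        return lastSeenId;
    }
}
```

In an actual receiver, a thread started from onStart() could call pollOnce() in a loop, sleeping between calls, until isStopped() returns true. Note that this approach only picks up inserted rows; updates and deletes would go unnoticed.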

The best and most reliable solution would be to avoid streaming from MySQL at all. When you insert your records into MySQL, also write them to Kafka (with a Kafka producer) as part of a transaction, and then consume them in your streaming application.

I don't think it's possible to stream from MySQL directly. Data can be ingested from sources such as Kafka, Flume, Twitter, ZeroMQ, Kinesis, or TCP sockets.
