MongoDB as a data source for Flink

Can MongoDB be used as a data source for Apache Flink when processing streaming data?

Does Apache Flink have a native way to use a NoSQL database as a data source?

At the time of writing, Flink does not have a dedicated connector for reading from MongoDB. You have two options:

  • Call StreamExecutionEnvironment.createInput with a Hadoop input format for MongoDB, wrapped in Flink's Hadoop input format wrapper
  • Write your own MongoDB source by implementing SourceFunction or ParallelSourceFunction

The first option should give you at-least-once processing guarantees, because the whole MongoDB collection is re-read after a recovery, so records that were already processed are processed again. With the second option you might be able to get exactly-once processing guarantees, depending on what the MongoDB client supports: the source has to be able to resume reading from the position recorded in the last checkpoint.
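To make the difference between the two guarantees concrete, here is a minimal sketch in plain Java. It deliberately uses neither Flink nor a MongoDB client: `ReplayDemo`, `COLLECTION`, and `run` are hypothetical names, the "collection" is an in-memory list read in a fixed order, and the "checkpoint" is just a copy of the read offset and the downstream state. It only illustrates why re-reading from the start after a failure duplicates records, while resuming from a checkpointed offset does not.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplayDemo {

    // Hypothetical stand-in for a MongoDB collection read in a stable order.
    static final List<String> COLLECTION = List.of("a", "b", "c", "d", "e");

    /**
     * Reads COLLECTION into a downstream "state" list, simulating one failure.
     *
     * @param checkpointEvery      take a checkpoint after every N records (must be > 0)
     * @param failAt               fail once when the source reaches this offset
     * @param sourceRestoresOffset true: the source resumes from the checkpointed offset;
     *                             false: the source re-reads the collection from the start
     * @return the downstream state after the whole collection has been read
     */
    static List<String> run(int checkpointEvery, int failAt, boolean sourceRestoresOffset) {
        List<String> state = new ArrayList<>();          // downstream operator state
        List<String> stateSnapshot = new ArrayList<>();  // state as of the last checkpoint
        int offsetSnapshot = 0;                          // source offset as of the last checkpoint
        boolean failed = false;
        int offset = 0;

        while (offset < COLLECTION.size()) {
            if (!failed && offset == failAt) {
                // Recovery: downstream state is always rolled back to the checkpoint.
                failed = true;
                state = new ArrayList<>(stateSnapshot);
                // The source either resumes from its checkpointed offset or starts over.
                offset = sourceRestoresOffset ? offsetSnapshot : 0;
                continue;
            }
            state.add(COLLECTION.get(offset));
            offset++;
            if (offset % checkpointEvery == 0) {
                // Checkpoint: snapshot source offset and downstream state together.
                offsetSnapshot = offset;
                stateSnapshot = new ArrayList<>(state);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 3, true));   // prints [a, b, c, d, e]
        System.out.println(run(2, 3, false));  // prints [a, b, a, b, c, d, e]
    }
}
```

In the second call every record still reaches the downstream state, but "a" and "b" arrive twice, which is what at-least-once means. In a real custom source the offset would have to be something the MongoDB client can actually resume from, and it would be stored in Flink's checkpointed state rather than in a local variable.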
