
Java Flink : Add Source as Stream and Sink in Batches

I am new to Flink, and I have a requirement where I need to read data continuously from a Kafka stream but write it to MongoDB in batches, so as to reduce the number of queries on the Mongo server.

Please guide me on the best way to do it.

What I have tried so far (see the sketch after this list):

  • Read data from the Kafka source
  • Apply a time window of 5 minutes
  • Reduce the entries to create a list of entries
  • Read the list in the MongoSink function and do a BulkWrite
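Roughly, that pipeline could look like the sketch below. This is a minimal sketch, not a definitive implementation: the broker address, topic, group id, database/collection names, and the assumption that the Kafka payloads are JSON strings are all placeholders.

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.InsertOneModel;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.flink.streaming.api.functions.windowing.ProcessAllWindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

public class KafkaToMongoBatchJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")    // placeholder broker
                .setTopics("events")                      // placeholder topic
                .setGroupId("mongo-batcher")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
           // collect everything that arrives within a 5-minute processing-time window
           .windowAll(TumblingProcessingTimeWindows.of(Time.minutes(5)))
           .process(new ProcessAllWindowFunction<String, List<Document>, TimeWindow>() {
               @Override
               public void process(Context ctx, Iterable<String> values,
                                   Collector<List<Document>> out) {
                   List<Document> batch = new ArrayList<>();
                   for (String v : values) {
                       batch.add(Document.parse(v)); // assumes JSON payloads
                   }
                   if (!batch.isEmpty()) {
                       out.collect(batch);
                   }
               }
           })
           .addSink(new MongoBulkSink());

        env.execute("kafka-to-mongo-batches");
    }

    /** Writes each windowed batch to MongoDB with a single bulkWrite call. */
    static class MongoBulkSink extends RichSinkFunction<List<Document>> {
        private transient MongoClient client;
        private transient MongoCollection<Document> collection;

        @Override
        public void open(Configuration parameters) {
            client = MongoClients.create("mongodb://localhost:27017"); // placeholder URI
            collection = client.getDatabase("mydb").getCollection("events");
        }

        @Override
        public void invoke(List<Document> batch, Context context) {
            List<InsertOneModel<Document>> ops = new ArrayList<>(batch.size());
            for (Document d : batch) {
                ops.add(new InsertOneModel<>(d));
            }
            collection.bulkWrite(ops); // one round trip per 5-minute window
        }

        @Override
        public void close() {
            if (client != null) {
                client.close();
            }
        }
    }
}
```

Because the window function emits one List<Document> per window, the sink issues a single bulkWrite round trip every 5 minutes instead of one insert per record.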

Thanks, ashnik

The above seems like it should work. Since the Mongo client is pretty simple, if you wanted to be more efficient, you could implement your own stateful ProcessFunction that keeps a list of entries, and flushes to MongoDB when the list hits a certain size or sufficient time has elapsed.
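For reference, here is a minimal sketch of that idea as a KeyedProcessFunction (the stateful ProcessFunction variant that supports timers). It assumes the stream of org.bson.Document values has been keyed beforehand, e.g. keyBy(d -> 0) for one global buffer, and the batch size, flush interval, and Mongo connection details are placeholder values:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.InsertOneModel;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

/** Buffers documents in keyed state; flushes to MongoDB on size or elapsed time. */
public class BufferingMongoWriter extends KeyedProcessFunction<Integer, Document, Void> {
    private static final int MAX_BATCH_SIZE = 500;               // placeholder threshold
    private static final long FLUSH_INTERVAL_MS = 5 * 60 * 1000L;

    private transient ListState<Document> buffer;
    private transient ValueState<Integer> count;
    private transient ValueState<Long> pendingTimer;
    private transient MongoClient client;
    private transient MongoCollection<Document> collection;

    @Override
    public void open(Configuration parameters) {
        buffer = getRuntimeContext().getListState(
                new ListStateDescriptor<>("buffer", Document.class));
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Integer.class));
        pendingTimer = getRuntimeContext().getState(
                new ValueStateDescriptor<>("timer", Long.class));
        client = MongoClients.create("mongodb://localhost:27017"); // placeholder URI
        collection = client.getDatabase("mydb").getCollection("events");
    }

    @Override
    public void processElement(Document doc, Context ctx, Collector<Void> out) throws Exception {
        buffer.add(doc);
        int n = (count.value() == null ? 0 : count.value()) + 1;
        count.update(n);

        // arm a processing-time timer when the first element of a new batch arrives
        if (pendingTimer.value() == null) {
            long ts = ctx.timerService().currentProcessingTime() + FLUSH_INTERVAL_MS;
            ctx.timerService().registerProcessingTimeTimer(ts);
            pendingTimer.update(ts);
        }

        if (n >= MAX_BATCH_SIZE) {
            flush(ctx); // size threshold reached: flush early
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Void> out) throws Exception {
        flush(ctx); // time threshold reached: flush whatever has accumulated
    }

    private void flush(Context ctx) throws Exception {
        List<InsertOneModel<Document>> ops = new ArrayList<>();
        for (Document d : buffer.get()) {
            ops.add(new InsertOneModel<>(d));
        }
        if (!ops.isEmpty()) {
            collection.bulkWrite(ops); // single round trip for the whole batch
        }
        buffer.clear();
        count.clear();
        Long ts = pendingTimer.value();
        if (ts != null) {
            ctx.timerService().deleteProcessingTimeTimer(ts); // no-op if it already fired
            pendingTimer.clear();
        }
    }

    @Override
    public void close() {
        if (client != null) {
            client.close();
        }
    }
}
```

Compared with a fixed window, this flushes early when the buffer fills up, so a traffic burst does not accumulate an unbounded batch; with real keys instead of a constant, each key gets its own buffer and timer. Since the buffered documents live in Flink keyed state, they survive failures only if checkpointing is enabled.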

