简体   繁体   中英

Does flink provide checkpointing for datasets batch processing

How to configure check pointing for flink batch processing. I'm interested in knowing how checking pointing work internally. Since check point happens at an interval, if the job failed before the next point, won't there be any duplicate processing if it restarts. Does flink check points for each operator, sink and sources?

Flink does not support checkpointing on the DataSet API.

You can use checkpointing in DataStream with finite sources though, which covers most of the DataSet API use cases already. The long-term vision is to completely replace the DataSet API with DataStream + finite sources, such that users do not need to write two programs if they want to analyze a stream or batch.

With Table API and SQL, this goal is already pretty near.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM