Does Flink provide checkpointing for DataSet batch processing?
How do I configure checkpointing for Flink batch processing? I'm also interested in how checkpointing works internally. Since checkpoints happen at an interval, if the job fails before the next checkpoint, won't there be duplicate processing when it restarts? Does Flink checkpoint each operator, source, and sink?
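On the duplicate-processing question: after a failure, Flink rewinds its sources to the offsets recorded in the last completed checkpoint, so records processed after that snapshot are read again. Operator state is rolled back together with the snapshot, so internal state stays consistent, but a non-transactional sink can observe those replayed records twice (at-least-once at the sink). A minimal simulation of that replay behavior (plain Python, purely illustrative, not the Flink API):

```python
# Illustrative simulation only (not Flink code): why restarting from the
# last checkpoint can reprocess records at a non-transactional sink.

def run_with_checkpoints(records, checkpoint_every, fail_at=None):
    """Process records one by one, durably snapshotting the read offset
    every `checkpoint_every` records. On failure, return what was emitted
    so far plus the last checkpointed offset (the restart position)."""
    last_checkpoint = 0
    emitted = []
    for offset, rec in enumerate(records):
        if offset == fail_at:
            return emitted, last_checkpoint  # crash before this record
        emitted.append(rec)                  # side effect at the sink
        if (offset + 1) % checkpoint_every == 0:
            last_checkpoint = offset + 1     # snapshot completed
    return emitted, last_checkpoint

records = list(range(10))

# First attempt crashes at offset 7; the last snapshot covered offsets 0-5,
# so record 6 was already emitted to the sink but not yet checkpointed.
first, cp = run_with_checkpoints(records, checkpoint_every=3, fail_at=7)

# On restart the source rewinds to the checkpointed offset, so record 6
# is read and emitted a second time.
retry, _ = run_with_checkpoints(records[cp:], checkpoint_every=3)
assert (first + retry).count(6) == 2  # duplicate at the sink
```

Flink's transactional (two-phase-commit) sinks close this gap by making output visible only once the checkpoint that covers it has completed, which is what gives end-to-end exactly-once.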
Flink does not support checkpointing on the DataSet API.

You can, however, use checkpointing in the DataStream API with finite (bounded) sources, which already covers most DataSet API use cases. The long-term vision is to replace the DataSet API entirely with DataStream + bounded sources, so that users do not need to write two separate programs to analyze a stream and a batch.

With the Table API and SQL, this goal is already quite close.