
What is the use of setting the checkpoint interval in Spark Streaming?

**1) What is the checkpoint interval used for?

2) Is there any relationship between the checkpoint interval and the window size, sliding interval, or batch interval?

3) When running Spark Streaming in standalone cluster mode, are tasks distributed to the worker nodes by the master?

I am performing a streaming operation that reads a list of files from my file system, but the file-reading task is always done by only one worker and is not shared across all the workers. (I currently have two workers.)

Thanks for your help!**

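1, 2) The checkpoint interval controls how often the stream's data is written to the checkpoint store. It is generally set to about 5-7 times the interval at which the stream produces data (the batch interval, or the sliding interval for a windowed stream), which is considered a good setting. See also http://blog.cloudera.com/blog/2014/03/a-guide-to-checkpointing-in-hadoop/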
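As a rough illustration of that rule of thumb, here is a minimal Python sketch that derives a checkpoint interval from a stream's sliding interval. The helper name `suggest_checkpoint_interval` and the default factor of 5 are assumptions made for this example, not part of Spark. The PySpark calls in the trailing comments (`StreamingContext.checkpoint` and `DStream.checkpoint`) only show where the value would be used and are not executed here.

```python
def suggest_checkpoint_interval(slide_seconds, factor=5):
    """Return a checkpoint interval in seconds.

    Hypothetical helper: multiplies the stream's sliding interval
    (equal to the batch interval for non-windowed streams) by a whole
    number, so the result stays aligned with batch boundaries.
    """
    if slide_seconds <= 0:
        raise ValueError("slide_seconds must be positive")
    if not isinstance(factor, int) or factor < 1:
        raise ValueError("factor must be a positive integer")
    return slide_seconds * factor


# Example values (assumed for illustration only).
batch_seconds = 2   # batch interval of the streaming context
slide_seconds = 4   # sliding interval of a windowed stream
checkpoint_seconds = suggest_checkpoint_interval(slide_seconds)

# Where the value would be applied in PySpark (not run here):
#   ssc = StreamingContext(sc, batch_seconds)
#   ssc.checkpoint("/path/to/checkpoint/dir")   # placeholder directory
#   windowed = lines.window(20, slide_seconds)
#   windowed.checkpoint(checkpoint_seconds)
```

A factor anywhere in the 5-7 range fits the guideline above. Checkpointing much more often adds write overhead to every batch, while checkpointing much less often lengthens the lineage that has to be recomputed on recovery.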

3) Yes
