
What mechanism do Giraph's workers use when vertices receive messages?

In Giraph's worker API documentation, I found this description of a method:

public void storeCheckpoint()
// Both the vertices and the messages need to be checkpointed in order for them to be used. 
// This is done after all messages have been delivered, but prior to a superstep starting.

I know that vertices use the messages they have received in the compute() method, but when do they actually receive them? If it is before the checkpoint process, is there any part of the documentation or code I can read to understand it?

Also, what mechanism does Giraph use to store messages before superstep S+1? Are they kept in a memory buffer or written to disk first?

I find nothing in the Giraph documentation about this.

All messages are delivered in bulk after a superstep ends, and they determine on which vertices the compute function should execute in the next superstep. This is the Bulk Synchronous Parallel (BSP) model. Every vertex to which a message has been delivered becomes active, and the compute method is executed in parallel on each of those vertices. That is one superstep. The process repeats until all vertices have voted to halt. Halting uses an Apache ZooKeeper zk node, or, if you prefer, a function, writeHaltInstructions(args, args), to halt the running process.

Remember that you still need the storeCheckpoint function: sometimes the Map phase reaches 100% and, because there is no reducer, the process just keeps running and never halts. For those situations you need a checkpoint function that keeps track of the checkpoints you have reached. I hope this helps.
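To make the superstep cycle described above concrete, here is a minimal single-process sketch in plain Java. It is a toy model, not Giraph's implementation: the class BspSketch, the method run, and the two in-memory maps are hypothetical names chosen for illustration, and the sketch says nothing about whether Giraph itself buffers messages in memory or on disk. It propagates the maximum value through a graph: messages sent during superstep S are collected in a separate store and only become visible to compute logic in superstep S+1, and a vertex that has voted to halt is reactivated only when a message arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy BSP loop (hypothetical, not Giraph code): propagate the maximum vertex value.
class BspSketch {
    static int[] run(int[] initialValues, int[][] neighbors) {
        int n = initialValues.length;
        int[] value = initialValues.clone();
        // Messages delivered to each vertex for the superstep about to run.
        Map<Integer, List<Integer>> current = new HashMap<>();

        for (int superstep = 0; ; superstep++) {
            // Messages sent during this superstep; invisible until after the barrier.
            Map<Integer, List<Integer>> incoming = new HashMap<>();
            boolean anyActive = false;

            for (int v = 0; v < n; v++) {
                List<Integer> msgs = current.get(v);
                // Every vertex is active in superstep 0; later, only vertices with messages.
                boolean active = superstep == 0 || msgs != null;
                if (!active) {
                    continue;
                }
                anyActive = true;

                // The "compute" step: take the maximum of the received messages.
                boolean changed = superstep == 0;
                if (msgs != null) {
                    for (int m : msgs) {
                        if (m > value[v]) {
                            value[v] = m;
                            changed = true;
                        }
                    }
                }
                // Send the value onward only if it changed (or in the first superstep).
                if (changed) {
                    for (int u : neighbors[v]) {
                        incoming.computeIfAbsent(u, k -> new ArrayList<>()).add(value[v]);
                    }
                }
                // Falling out of this iteration models the vertex voting to halt.
            }

            if (!anyActive) {
                break; // all vertices halted and no messages in flight
            }
            // Synchronization barrier: this superstep's messages become next superstep's input.
            current = incoming;
        }
        return value;
    }
}
```

For example, calling BspSketch.run on a four-vertex chain with values 3, 6, 2, 1 leaves every vertex holding 6. In this model, the point between filling the incoming store and swapping it in at the barrier corresponds to the moment the quoted documentation describes for storeCheckpoint(): after all messages have been delivered, but before the next superstep starts.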

