
Questions about MapReduce execution in Hadoop

Assume the data a task needs is not present on its own node but on some other machine:

  • How will the task tracker know which node contains data?

  • Does it talk to that data node directly, or does it contact its own data node, which then takes responsibility for copying the data?

How will the task tracker know which node contains data?

The TaskTracker does not know. The JobTracker contacts the NameNode to get the locations of the data, then tries its best to assign each task to a TaskTracker on the node that holds the task's input data (or on one as close to it as possible).
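
To make the "same node, or as close as possible" idea concrete, here is a minimal Python sketch of locality-aware task assignment. It is not Hadoop's scheduler code: the node names, rack names, split names, and the functions `locality_level` and `pick_task` are all hypothetical, and the three-level ranking (node-local, rack-local, off-rack) is an assumption about what "close" means.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical cluster layout: node name -> rack name.
RACK_OF: Dict[str, str] = {
    "node1": "rackA",
    "node2": "rackA",
    "node3": "rackB",
}

# Hypothetical block locations, as the NameNode might report them:
# input split -> nodes that hold a replica of its data.
SPLIT_LOCATIONS: Dict[str, List[str]] = {
    "split-0": ["node1"],
    "split-1": ["node2"],
    "split-2": ["node3"],
}


def locality_level(tracker_node: str, replica_nodes: List[str]) -> int:
    """Return 0 for node-local, 1 for rack-local, 2 for off-rack."""
    if tracker_node in replica_nodes:
        return 0
    tracker_rack = RACK_OF[tracker_node]
    if any(RACK_OF[node] == tracker_rack for node in replica_nodes):
        return 1
    return 2


def pick_task(tracker_node: str, pending: List[str]) -> Optional[Tuple[str, int]]:
    """Pick the pending split whose data is closest to the given tracker.

    Returns (split, locality_level), or None if nothing is pending.
    """
    if not pending:
        return None
    best = min(
        pending,
        key=lambda split: locality_level(tracker_node, SPLIT_LOCATIONS[split]),
    )
    return best, locality_level(tracker_node, SPLIT_LOCATIONS[best])
```

For example, `pick_task("node1", ["split-1", "split-2"])` prefers `split-1`, because its data sits on the same rack as `node1`, while `split-2` would have to be read from another rack.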

Does it talk to that data node directly, or does it contact its own data node, which then takes responsibility for copying the data?

It talks to the remote DataNode directly. The local DataNode does not copy the data on its behalf.
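
The direct read can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not the HDFS client: the classes `NameNode` and `DataNode`, the function `read_block_for_task`, and the block ID are hypothetical stand-ins. The point it shows is that the NameNode only answers "where is the block?", and the reader then fetches the bytes from a DataNode that holds them, without routing through the DataNode on its own machine.

```python
from typing import Dict, List


class NameNode:
    """Toy NameNode: holds metadata only (which nodes store each block)."""

    def __init__(self, block_map: Dict[str, List[str]]) -> None:
        self.block_map = block_map

    def get_block_locations(self, block_id: str) -> List[str]:
        return list(self.block_map[block_id])


class DataNode:
    """Toy DataNode: stores block contents and serves them to readers."""

    def __init__(self, name: str, blocks: Dict[str, bytes]) -> None:
        self.name = name
        self.blocks = blocks
        self.reads_served = 0  # counts how often this node was read from

    def read_block(self, block_id: str) -> bytes:
        self.reads_served += 1
        return self.blocks[block_id]


def read_block_for_task(
    block_id: str,
    local_node: str,
    namenode: NameNode,
    datanodes: Dict[str, DataNode],
) -> bytes:
    """Read a block on behalf of a task running on local_node."""
    locations = namenode.get_block_locations(block_id)
    # Use the local replica if there is one; otherwise go straight to a
    # remote DataNode. The local DataNode is never asked to fetch the data.
    target = local_node if local_node in locations else locations[0]
    return datanodes[target].read_block(block_id)


# Hypothetical two-node cluster: the block lives only on node2.
node1 = DataNode("node1", {})
node2 = DataNode("node2", {"blk_1": b"hello"})
namenode = NameNode({"blk_1": ["node2"]})
data = read_block_for_task(
    "blk_1", "node1", namenode, {"node1": node1, "node2": node2}
)
```

After this runs, `node2.reads_served` is 1 and `node1.reads_served` is 0, which mirrors the answer above: the task on `node1` went directly to the node holding the data.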
