
Queries regarding map-reduce execution in Hadoop

Assume the data a task needs is not present on its own node but on some other machine:

  • How will the TaskTracker know which node contains the data?

  • Does it talk to that DataNode directly? Or does it contact its own DataNode, which then takes responsibility for copying the data?

How will the TaskTracker know which node contains the data?

The TaskTracker does not know it. The JobTracker contacts the NameNode, gets the locations of the data, and tries its best to assign each task to a TaskTracker on the same node as the data it reads (or as close to it as possible).
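The placement preference described above can be sketched as a small toy model. This is not Hadoop's scheduler code: `Tracker`, `pick_tracker`, and the `rack_of` mapping are hypothetical names made up for illustration, and the block locations are assumed to have already been fetched from the NameNode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracker:
    """A free TaskTracker slot (hypothetical stand-in, not a Hadoop class)."""
    host: str
    rack: str


def pick_tracker(block_hosts, free_trackers, rack_of):
    """Pick a tracker for one map task, preferring data locality.

    block_hosts   -- hosts holding a replica of the task's input block
    free_trackers -- list of Tracker objects with a free slot
    rack_of       -- dict mapping host name to rack name (assumed topology)
    Returns (tracker, locality), or (None, "none") if nothing is free.
    """
    block_hosts = set(block_hosts)
    block_racks = {rack_of[h] for h in block_hosts if h in rack_of}

    # First choice: a tracker on a node that already stores the block.
    for tracker in free_trackers:
        if tracker.host in block_hosts:
            return tracker, "node-local"

    # Second choice: a tracker in the same rack as some replica.
    for tracker in free_trackers:
        if tracker.rack in block_racks:
            return tracker, "rack-local"

    # Last resort: any free tracker; the data will cross racks.
    if free_trackers:
        return free_trackers[0], "off-rack"
    return None, "none"


if __name__ == "__main__":
    rack_of = {"node1": "rackA", "node2": "rackA", "node3": "rackB"}
    free = [Tracker("node2", "rackA"), Tracker("node3", "rackB")]
    print(pick_tracker(["node3"], free, rack_of))
    print(pick_tracker(["node1"], free, rack_of))
```

In this sketch the TaskTracker itself never looks anything up; it is simply handed a task, which matches the point that the TaskTracker does not know where the data lives.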

Does it talk to that DataNode directly? Or does it contact its own DataNode, which then takes responsibility for copying the data?

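It talks to the DataNode directly. When the data is not local, the task reads the block over the network straight from the remote DataNode that holds it; the local DataNode does not copy it first.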
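A minimal in-memory sketch of that read path, under stated assumptions: the `NameNode` and `DataNode` classes below are toy stand-ins written for this example (not the Hadoop API), and `read_block` is a hypothetical helper. The idea it illustrates is that the metadata lookup goes to the NameNode, while the bytes come directly from whichever DataNode holds a replica.

```python
class NameNode:
    """Toy metadata service: knows which hosts store each block, holds no data."""

    def __init__(self):
        self.block_map = {}

    def register(self, block_id, hosts):
        self.block_map[block_id] = list(hosts)

    def get_block_locations(self, block_id):
        return list(self.block_map[block_id])


class DataNode:
    """Toy storage node: holds block bytes and counts the reads it serves."""

    def __init__(self, host):
        self.host = host
        self.blocks = {}
        self.reads_served = 0

    def read(self, block_id):
        self.reads_served += 1
        return self.blocks[block_id]


def read_block(block_id, local_host, namenode, datanodes):
    """Read a block on behalf of a task running on local_host.

    datanodes maps host name to DataNode. Returns (data, host_read_from).
    """
    hosts = namenode.get_block_locations(block_id)
    # Prefer a replica on the task's own node; otherwise go straight
    # to a remote DataNode. The local DataNode is not involved.
    target = local_host if local_host in hosts else hosts[0]
    return datanodes[target].read(block_id), target


if __name__ == "__main__":
    nn = NameNode()
    dns = {"node1": DataNode("node1"), "node2": DataNode("node2")}
    dns["node2"].blocks["blk_1"] = b"some input records"
    nn.register("blk_1", ["node2"])
    # The task runs on node1, but the block only lives on node2.
    print(read_block("blk_1", "node1", nn, dns))
    print(dns["node1"].reads_served)
```

In the usage example the task's own node serves no reads at all, which is the behavior the answer describes.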
