
Queries regarding map-reduce execution in Hadoop

Original post 2012-06-01 19:47:18 8 1 hadoop

Assume the data a task needs is not present on the task's own node but on some other machine:

How will the task tracker know which node contains the data?
Does it talk to that data node directly? Or does it contact its own data node, which then takes responsibility for copying the data?

1 answer

How will the task tracker know which node contains the data?

The TaskTracker does not know it. The JobTracker contacts the Namenode, gets the locations of the data, and tries its best to assign each task to a TaskTracker on the same node as the task's data (or as close to it as possible).
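The placement preference described above can be illustrated with a toy model. This is only a sketch of the idea, not Hadoop's actual scheduler code: `pick_split`, the split ids, the host names, and the rack map are all hypothetical, and the replica lists stand in for what the Namenode would report.

```python
from typing import Dict, List, Optional


def pick_split(tracker_host: str,
               pending: Dict[str, List[str]],
               rack_of: Dict[str, str]) -> Optional[str]:
    """Return the id of the pending split best placed for tracker_host.

    pending maps a split id to the hosts holding a replica of its block.
    Preference order: node-local, then rack-local, then any remaining split.
    """
    if not pending:
        return None

    # 1. Node-local: the requesting host already stores a replica.
    for split_id, hosts in pending.items():
        if tracker_host in hosts:
            return split_id

    # 2. Rack-local: some replica lives on the same rack as the host.
    rack = rack_of.get(tracker_host)
    if rack is not None:
        for split_id, hosts in pending.items():
            if any(rack_of.get(h) == rack for h in hosts):
                return split_id

    # 3. No nearby replica: hand out any split; its data will be read remotely.
    return next(iter(pending))


# Hypothetical cluster layout (all names are made up for illustration).
pending_splits = {
    "split-0": ["node1", "node2"],
    "split-1": ["node3"],
}
racks = {
    "node1": "rackA", "node2": "rackA",
    "node3": "rackB", "node4": "rackB",
    "node5": "rackC",
}

print(pick_split("node2", pending_splits, racks))  # node-local choice
print(pick_split("node4", pending_splits, racks))  # rack-local choice
print(pick_split("node5", pending_splits, racks))  # remote read
```

In this model node2 gets a split it already stores, node4 gets the split whose replica is on its rack, and node5 falls back to a split it has to read over the network, which is the case the question asks about.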

Does it talk to that data node directly? Or does it contact its own data node, which then takes responsibility for copying the data?

It talks to the Datanode directly.
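A rough way to picture this read path, again as a toy model rather than the real HDFS client: `ToyNameNode`, `ToyDataNode`, and `read_block` are hypothetical names. The point is only that the Namenode hands back locations, while the bytes come straight from a Datanode that holds a replica, without passing through the reader's own Datanode.

```python
from typing import Dict, List, Tuple


class ToyNameNode:
    """Holds metadata only: which hosts store a replica of each block."""

    def __init__(self, block_hosts: Dict[str, List[str]]) -> None:
        self.block_hosts = block_hosts

    def get_block_locations(self, block_id: str) -> List[str]:
        return list(self.block_hosts[block_id])


class ToyDataNode:
    """Holds the actual block contents for one host."""

    def __init__(self, blocks: Dict[str, bytes]) -> None:
        self.blocks = blocks

    def read_block(self, block_id: str) -> bytes:
        return self.blocks[block_id]


def read_block(block_id: str,
               local_host: str,
               namenode: ToyNameNode,
               datanodes: Dict[str, ToyDataNode]) -> Tuple[str, bytes]:
    """Ask the name node where the block is, then read it from that host."""
    hosts = namenode.get_block_locations(block_id)
    # Prefer a local replica; otherwise go directly to a remote holder.
    source = local_host if local_host in hosts else hosts[0]
    return source, datanodes[source].read_block(block_id)


# Hypothetical setup: the task runs on node1, the block lives only on node3.
namenode = ToyNameNode({"blk_1": ["node3"]})
datanodes = {
    "node1": ToyDataNode({}),
    "node3": ToyDataNode({"blk_1": b"input records"}),
}

source, data = read_block("blk_1", "node1", namenode, datanodes)
print(source, data)
```

Here the task on node1 reads the block from node3 itself; node1's own Datanode is never asked to fetch or copy anything.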