Hadoop Ecosystem: MapReduce needed for Pig/Hive

There are a whole lot of Hadoop ecosystem pictures on the internet, and I struggle to understand how the tools work together.

E.g. in the attached picture, why are Pig and Hive based on MapReduce, whereas other tools like Spark or Storm sit directly on YARN?

Would you be so kind as to explain this?

Thanks! BR

[Image: Hadoop ecosystem]

The picture shows Pig and Hive on top of MapReduce. This is because MapReduce is the distributed computing engine that Pig and Hive use: Pig and Hive queries get executed as MapReduce jobs. Pig and Hive are easier to work with than raw MapReduce, since they give you a higher-level abstraction over it.
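To make "queries get executed as MapReduce jobs" a bit more concrete, here is a rough, purely conceptual sketch in plain Python of how a query like `SELECT user, COUNT(*) FROM visits GROUP BY user` breaks down into a map step, a shuffle, and a reduce step. The `visits` table, its rows, and all function names are made up for illustration; this is not how Hive or Pig actually generate their jobs, and real plans usually have more stages.

```python
from collections import defaultdict

# Hypothetical rows of a table visits(user, page); illustrative data only.
ROWS = [("ann", "/home"), ("bob", "/home"), ("ann", "/cart"), ("ann", "/home")]


def map_phase(rows):
    """Emit (key, 1) for every row, like the map side of GROUP BY + COUNT(*)."""
    for user, _page in rows:
        yield user, 1


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_user(rows):
    """Run the three steps in order, in one process instead of on a cluster."""
    return reduce_phase(shuffle(map_phase(rows)))


print(count_by_user(ROWS))
```

With Pig or Hive you only write the query; the engine takes care of producing and running the equivalent of the three functions above across the cluster.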

Now let's take a look at Spark / Storm / Flink on YARN in the picture. YARN is a cluster manager that allows various applications to run on top of it. Storm, Spark and Flink are all examples of applications that can run on top of YARN. MapReduce is also an application that runs on YARN, as shown in the diagram. YARN handles the resource management piece so that multiple applications can share the same cluster. (If you are interested in another example of a similar technology, check out Mesos.)
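As a toy illustration of what "handles the resource management piece" means, the sketch below models a single pool of memory that several applications request slices of. `ToyResourceManager`, its methods, and the application names are all hypothetical; this is not the YARN API, and real YARN schedulers (queues, containers, vcores, preemption) are far more involved.

```python
class ToyResourceManager:
    """Hypothetical stand-in for a cluster manager; tracks one memory pool."""

    def __init__(self, total_memory_mb):
        self.free_mb = total_memory_mb
        self.allocations = {}  # application name -> MB currently granted

    def request(self, app, memory_mb):
        """Grant the request if enough memory is free; return True or False."""
        if memory_mb > self.free_mb:
            return False
        self.free_mb -= memory_mb
        self.allocations[app] = self.allocations.get(app, 0) + memory_mb
        return True

    def release(self, app):
        """Return all memory held by an application to the shared pool."""
        self.free_mb += self.allocations.pop(app, 0)


rm = ToyResourceManager(total_memory_mb=8192)
rm.request("mapreduce-job", 4096)          # a MapReduce job (e.g. from Hive)
rm.request("spark-app", 3072)              # a Spark application
granted = rm.request("flink-app", 2048)    # refused: only 1024 MB are free
rm.release("mapreduce-job")                # the MapReduce job finishes
granted_retry = rm.request("flink-app", 2048)  # now there is room
print(granted, granted_retry, rm.free_mb)
```

The point is only that the applications do not manage the machines themselves; they ask a shared manager, which is why MapReduce, Spark, Storm and Flink can coexist on one cluster.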

Finally, at the bottom of the picture is HDFS. This is the distributed storage layer that allows applications to store and access data. It provides features such as distributed storage, replication and fault tolerance.
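For a feel of how replication gives you fault tolerance, here is a small simplified sketch: a file is cut into fixed-size blocks and each block is copied to several nodes, so losing one node does not lose any block. The function names, node names, and the round-robin placement are assumptions made for illustration; real HDFS placement is rack-aware and decided by the NameNode. The 128 MB block size and replication factor of 3 used in the example match common HDFS defaults.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size would occupy."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes (simple round-robin)."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


# Sizes are in MB; the node names are hypothetical.
blocks = split_into_blocks(file_size=300, block_size=128)
placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"], replication=3)

# Simulate losing node n2: every block still has at least two live copies.
survivors = {b: [n for n in ns if n != "n2"] for b, ns in placement.items()}
print(blocks)
print(survivors)
```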

If you are interested in deeper dives, check out the Apache Projects page.
