
Hive queries run from edge nodes

Are there any disadvantages to running Hive INSERT queries from an edge node compared with running them from Oozie workflows?

The Oozie docs say that running through Oozie will distribute the workload to the data nodes that are available.

But I was thinking that a query run from an edge node should still call the JobTracker and run on the cluster?

When you run a Hive command from the edge node, Hive takes that command, generates MapReduce code for it (in most cases), and sends that to the cluster, where it is treated like any other MapReduce job and uses as many data nodes as needed. Oozie would do the same thing. Either way, the work runs on the cluster.

So your assumption there is correct.
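
To make the edge-node path concrete, here is a minimal sketch of submitting an INSERT through the Hive CLI from a script. It assumes the `hive` command is on the PATH of the edge node; the table names `sales` and `sales_staging` and both helper functions are hypothetical and only there for illustration.

```python
import shutil
import subprocess

# Hypothetical query; the table names are placeholders, not from the question.
INSERT_QUERY = "INSERT INTO TABLE sales SELECT * FROM sales_staging"


def hive_cli_command(query):
    """Build the argument list for running one HiveQL statement with the Hive CLI."""
    # `hive -e <query>` executes the given statement and then exits.
    return ["hive", "-e", query]


def run_from_edge_node(query):
    """Submit the query from this machine and wait for the Hive CLI to finish."""
    if shutil.which("hive") is None:
        raise RuntimeError("hive CLI not found on PATH (is this an edge node?)")
    # The local process hands the generated job to the cluster (in most cases)
    # and blocks until the CLI returns; check=True raises on a non-zero exit.
    return subprocess.run(hive_cli_command(query), check=True)
```

Calling `run_from_edge_node(INSERT_QUERY)` on an edge node would block until the Hive CLI exits, so the script's own exit status reflects whether the query succeeded.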
