When to use Hive engine MR and when to use TEZ?

Under what conditions is it preferable to use the Hive engine TEZ over MR?

What are the pros and cons of each?

TEZ does the same work as MR, only faster. The more complex the query, the greater the benefit from TEZ. So TEZ is preferable whenever it works.

Tez generalizes the MapReduce paradigm into a more powerful framework by letting a single job execute a complex DAG (directed acyclic graph) of tasks. When a plan is implemented with map-reduce primitives, it inevitably contains a number of job boundaries. Each boundary adds the overhead of reading from and writing to durable storage and of job startup, and may miss easy optimization opportunities such as worker node reuse and warm caches.

Of course, your TEZ version may still contain unresolved bugs. That is the only problem you are likely to face when implementing a particular solution on TEZ.

Although MR is more mature, Hive-on-MR is deprecated in Hive 2 and may not be available in future versions.
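The engine can be chosen per session with the `hive.execution.engine` property, so a query that hits a TEZ bug can fall back to MR without changing the cluster default. A minimal sketch, assuming Tez is installed and configured for your Hive deployment; the table `sales` and its columns are hypothetical placeholders:

```sql
-- Run the following queries on Tez.
SET hive.execution.engine=tez;

-- Hypothetical table and columns, for illustration only.
SELECT region, COUNT(*) AS orders
FROM sales
GROUP BY region;

-- Fall back to MapReduce for a query that fails on Tez.
SET hive.execution.engine=mr;
```

The setting applies only to the current session, which makes it easy to run the same query under both engines and compare run times.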

Read also this:

Difference between MR and Tez

and this:

Introducing Tez
