
What is the use of Hive's LLAP when there is Hive TEZ?

In our project, we load data from a Greenplum database into HDFS (Hive). I recently learned that Hive 2 ships with a new component, LLAP, and I am confused about the concept. What exactly is LLAP for? If we already have Hive's Tez engine, what does LLAP add? A developer on our project told me that we are using Hive LLAP to load the data into Hive tables on HDFS. Is using LLAP good practice? If not, why not?

Could anyone clarify the above questions?

https://cwiki.apache.org/confluence/display/Hive/LLAP is a good place to learn about Hive Live Long And Process (LLAP).

As the link says

LLAP works within existing, process-based Hive execution to preserve the scalability and versatility of Hive. It does not replace the existing execution model but rather enhances it.

and

LLAP is not an execution engine (like MapReduce or Tez)

Rather, it provides a long-lived daemon (hence the LL part of the acronym) to replace interactions with the DataNode, and this daemon also provides caching, pre-fetching, and some query processing. This allows simple queries to be largely processed by the daemon itself, with more complex queries being performed in YARN containers as usual.

The link also shows how Tez AM can sit above all of this, and submit Hive tasks which operate via LLAP, which interacts with the DataNode as required. In the example, initial stages of the query are pushed into LLAP, but large shuffles are performed in separate containers.

LLAP nodes are an additional layer of nodes (one LLAP node per Hadoop DataNode) between Tez and the Hadoop DataNodes that can cache data and process some queries. Query execution is still scheduled and managed by Tez.

Each LLAP node runs a daemon that caches data, which can accelerate queries when the same data is accessed again and again.

In short, LLAP boosts performance: queries that run through LLAP in Hive generally perform very well. Hive also works without LLAP, but it can be slower.
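Because LLAP sits on top of Tez rather than replacing it, the choice between the two is usually a matter of session settings. A minimal sketch, assuming a Hive 2+ cluster where LLAP daemons are already running and these properties may be changed per session (some distributions restrict that); the table name `sales` is hypothetical, and the property names should be checked against the configuration reference for your Hive version:

```sql
-- Tez remains the execution engine; LLAP is layered on top of it.
SET hive.execution.engine=tez;

-- Ask Hive to run query fragments inside the long-lived LLAP daemons
-- where possible, instead of in freshly started YARN containers.
SET hive.execution.mode=llap;
SET hive.llap.execution.mode=all;

-- `sales` is a hypothetical table used only for illustration.
-- The plan should indicate which parts of the query are assigned to LLAP.
EXPLAIN SELECT region, COUNT(*) FROM sales GROUP BY region;

-- For comparison: plain Tez containers, with no LLAP involvement.
SET hive.execution.mode=container;
SET hive.llap.execution.mode=none;
```

Running the same query under both sets of settings is a simple way to see how much the daemons and their cache help for a given workload.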

