
Can Cloudera Impala make use of task nodes in EMR?

I have been experimenting with Impala on EMR, and it seems to me that it only makes use of the core nodes in the cluster, not the task nodes.

I am using the built-in Impala install provided by EMR, which is version 1.2.4. When I have task nodes in my cluster, they appear in the 'Known backends' list in the Impalad admin app. However, on the 'queries' page, under 'Query locations', it only ever shows the hostnames of the core nodes in my cluster, never the task nodes. This suggests to me that queries are only running on the core nodes. Perhaps that is because HDFS is only on the core nodes?

Can anyone confirm this? And if so, is there a way to get it to use them?

Cheers, Tom

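Impala will only run queries on the core nodes (the HDFS DataNodes), because each Impala process reads and writes directly to its local HDFS storage. Task nodes on EMR do not store HDFS data, so they have no local blocks to scan. This data locality is one of the ways Impala makes its performance gains.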
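To illustrate why a node without local HDFS data ends up with no work, here is a rough Python sketch of locality-based assignment. It is a toy model and not Impala's actual scheduler: the function name `assign_scan_ranges`, the host names, and the block IDs are all made up for the example, and the "least loaded local host" rule is just an assumption to keep it simple.

```python
def assign_scan_ranges(block_locations, backends):
    """Toy model: give each HDFS block to a backend holding a local replica.

    block_locations: dict of block id -> list of hosts storing a replica
    backends: list of hosts running a query daemon
    Returns a dict of host -> list of block ids assigned to that host.
    """
    backend_set = set(backends)
    assignments = {}
    for block_id, replica_hosts in sorted(block_locations.items()):
        # Only backends that store a replica can read the block locally.
        local = [h for h in replica_hosts if h in backend_set]
        if not local:
            continue  # this toy model simply skips blocks with no local backend
        # Pick the local host with the fewest blocks so far (ties broken by name).
        target = min(local, key=lambda h: (len(assignments.get(h, [])), h))
        assignments.setdefault(target, []).append(block_id)
    return assignments


# Hypothetical cluster: two core nodes hold HDFS blocks, the task node holds none.
backends = ["core-1", "core-2", "task-1"]
block_locations = {
    "b1": ["core-1", "core-2"],
    "b2": ["core-2"],
    "b3": ["core-1"],
}

plan = assign_scan_ranges(block_locations, backends)
for host in backends:
    print(host, plan.get(host, []))
```

In this model `task-1` is a known backend but never receives a block, which matches what the 'Query locations' page shows in the question.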
