
Hive on Spark CDH 5.7 - Failed to create spark client

We are getting the following error while executing Hive queries with the Spark engine:

Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

The following properties are set to use Spark as the execution engine instead of MapReduce:

set hive.execution.engine=spark;
set spark.executor.memory=2g;

I also tried changing the following properties:

set yarn.scheduler.maximum-allocation-mb=2048;
set yarn.nodemanager.resource.memory-mb=2048;
set spark.executor.cores=4;
set spark.executor.memory=4g;
set spark.yarn.executor.memoryOverhead=750;
set hive.spark.client.server.connect.timeout=900000ms;

Do I need to set any other properties? Can anyone suggest a fix?

It seems the YARN container memory was smaller than what the Spark executor required. Please set the YARN container memory and maximum allocation to be greater than Spark executor memory + overhead (see the worked example after the list below):

  1. yarn.scheduler.maximum-allocation-mb
  2. yarn.nodemanager.resource.memory-mb
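For example, with the values from the question (a sketch; the numbers are illustrative and should be sized to your cluster):

-- Executor container request = spark.executor.memory + spark.yarn.executor.memoryOverhead
--                             = 4096 MB + 750 MB = 4846 MB
-- A 4846 MB request can never be granted while yarn.scheduler.maximum-allocation-mb=2048.
set spark.executor.memory=4g;
set spark.yarn.executor.memoryOverhead=750;
-- Note: the yarn.* properties are cluster-side settings read from yarn-site.xml
-- (or Cloudera Manager); setting them from a Hive session has no effect on YARN itself.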

yarn.nodemanager.resource.memory-mb:

Amount of physical memory, in MB, that can be allocated for containers. It is the amount of memory YARN can use on this node, so it should be lower than the total memory of the machine (for example, on a 48 GB worker you might leave ~8 GB for the OS and Hadoop daemons and give the remaining 40 GB to containers).

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>40960</value> <!-- 40 GB -->
</property>

The next step is to tell YARN how to break up the total available resources into containers. You do this by specifying the minimum unit of RAM to allocate per container.

In yarn-site.xml

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name> <!-- RAM per container -->
  <value>2048</value>
</property>

yarn.scheduler.maximum-allocation-mb:

It defines the maximum memory allocation available for a container, in MB.

This means the ResourceManager can only allocate memory to containers in increments of yarn.scheduler.minimum-allocation-mb, without exceeding yarn.scheduler.maximum-allocation-mb, and the maximum should not be more than the total memory allocated to the node.

In yarn-site.xml

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name> <!-- max RAM per container -->
  <value>8192</value>
</property>
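Putting the two scheduler properties together, here is a worked example (assuming the 2048 minimum and 8192 maximum shown above, and the 4846 MB executor request from earlier):

-- request = 4096 MB (executor) + 750 MB (overhead)   = 4846 MB
-- granted = ceil(4846 / 2048) * 2048                 = 6144 MB (rounded up to a multiple of the minimum)
-- check   = 6144 MB <= 8192 MB (maximum-allocation)  => the container fits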

Also check the Spark History Server: go to the Spark on YARN service > Instances > History Server > History Server Web UI, click the relevant job, then the relevant failed job, then the failed stages for that job, and look at the "details" section.
