
Hive on Spark: always wrong executor_cores in job application from Spark Master web UI

I am trying to switch Hive 2.1.1 from MapReduce to Hive on Spark. As described on the Hive on Spark official site, I built Spark 1.6.0 (the Spark version referenced in the Hive 2.1.1 source POM) without Hive. Spark itself works fine with a spark-submit / spark-shell test.
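For context, the "without Hive" build followed the Hive on Spark getting-started instructions and looked roughly like the sketch below; the Hadoop profile and the spark-master host name are placeholders that depend on the actual cluster, so treat this as an outline rather than the exact commands:

```sh
# Build a Spark 1.6.0 distribution without the Hive jars, as the Hive on Spark
# guide recommends (the Hadoop profile depends on your cluster's Hadoop version)
cd spark-1.6.0
./make-distribution.sh --name "hadoop2-without-hive" --tgz \
    "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

# Sanity check that the standalone cluster accepts jobs at all
# (spark://spark-master:7077 is a placeholder for the real master URL)
./bin/spark-submit --master spark://spark-master:7077 \
    --class org.apache.spark.examples.SparkPi \
    lib/spark-examples-*.jar 100
```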

I set spark.executor.cores / spark.executor.memory

in hive-site.xml, and also limit these two with

SPARK_WORKER_CORES / SPARK_WORKER_MEMORY

in spark-env.sh. But after I start a Hive query such as select count(*) from the Hive CLI, the job in the Spark Master web UI always shows 0 CPU cores, so the job is not executed and the Hive query waits forever in the CLI. The Spark cluster is set up in a Docker environment: each "server" is a Docker container running on a host that adds up to 160 cores / 160 GB of memory. Before I set SPARK_WORKER_CORES / SPARK_WORKER_MEMORY, 156 cores were always requested, which also led to failures because there were not enough resources. After I limited SPARK_WORKER_CORES / SPARK_WORKER_MEMORY to the resources assigned to the Docker container, 0 cores are requested.
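For reference, the two pieces of configuration described above look roughly like the sketch below; the core/memory numbers, the spark-master URL, and the table name are placeholders, not the exact values from my cluster:

```sh
# spark-env.sh on each worker container: cap what the worker advertises so it
# matches the resources actually granted to the Docker container (placeholder values)
export SPARK_WORKER_CORES=8
export SPARK_WORKER_MEMORY=16g

# The executor-side properties live in hive-site.xml, but they can also be set
# per session from the Hive CLI, which is convenient for experimenting:
hive -e "
set hive.execution.engine=spark;
set spark.master=spark://spark-master:7077;
set spark.executor.cores=2;
set spark.executor.memory=4g;
select count(*) from some_table;
"
```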

I have been stuck on this problem for 2 days without progress. I hope for some tips from anyone who is familiar with Hive on Docker, or who runs Hive/Spark in a Docker environment.

I don't think the Spark execution engine works well with Hive at all. The Hive version you are trying to integrate with Spark is built with Spark 2.0.0, not 1.6.0. There has been a lot of discussion about this before; see the thread here. You are better off using Tez, as many users report in that thread.
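For example, you can check which Spark version your Hive build declares and try the same query on Tez; a rough sketch is below (the source directory and the table name are placeholders, and Tez must already be installed and configured):

```sh
# Inside the Hive 2.1.1 source checkout: see which Spark version the POM declares
cd hive-2.1.1-src
grep -m1 "<spark.version>" pom.xml

# Try the same query on Tez instead of Spark (the engine can also be set
# permanently via hive.execution.engine in hive-site.xml)
hive -e "set hive.execution.engine=tez; select count(*) from some_table;"
```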
