
How to configure Hive to use Spark execution engine on Google Dataproc?

I'm trying to configure Hive, running on Google Dataproc image v1.1 (so Hive 2.1.0 and Spark 2.0.2), to use Spark as an execution engine instead of the default MapReduce one.

Following the instructions at https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started doesn't really help: whenever I set hive.execution.engine=spark, I keep getting "Error running query: java.lang.NoClassDefFoundError: scala/collection/Iterable".
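For context, the wiki's core step is simply switching the execution engine, either per-session (SET hive.execution.engine=spark;) or in hive-site.xml. This configuration fragment is what triggers the NoClassDefFoundError above on Dataproc:

```xml
<!-- hive-site.xml: switch Hive's execution engine from mr to spark,
     as described in the Hive on Spark wiki -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
```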

Does anyone know the specific steps to get this running on Dataproc? From what I can tell it should just be a question of making Hive see the right JARs, since both Hive and Spark are already installed and configured on the cluster, and using Hive from Spark (so the other way around) works fine.

This will probably not work with the jars in a Dataproc cluster. On Dataproc, Spark is compiled with Hive bundled (-Phive), which Hive on Spark explicitly recommends against and does not support.

If you really want to run Hive on Spark, you might want to try bringing your own Spark, compiled as described in the wiki (i.e. without -Phive), via an initialization action.
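A rough sketch of what such an initialization action could look like. The tarball URI is a placeholder you would have to supply yourself (a Spark build compiled without -Phive, per the wiki); the install path and symlinking are assumptions, not a tested recipe:

```shell
#!/bin/bash
# Hypothetical Dataproc init action: install a custom Spark build
# compiled WITHOUT -Phive, so Hive on Spark can use it.
set -euo pipefail

# Placeholder: point this at your own Spark-without-Hive tarball.
SPARK_TARBALL_URI="gs://my-bucket/spark-2.0.2-bin-without-hive.tgz"

gsutil cp "${SPARK_TARBALL_URI}" /tmp/spark.tgz
mkdir -p /opt/spark-no-hive
tar -xzf /tmp/spark.tgz -C /opt/spark-no-hive --strip-components=1

# Point Hive at this Spark installation (hive-env.sh is one place to do it).
echo 'export SPARK_HOME=/opt/spark-no-hive' >> /etc/hive/conf/hive-env.sh
```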

If you just want to move Hive off MapReduce on Dataproc, running Tez via this initialization action would probably be easier.
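A minimal sketch of that route, assuming the publicly published dataproc-initialization-actions bucket and its Tez script (verify the path and image-version compatibility before relying on it):

```shell
# Create a Dataproc cluster with the Tez initialization action,
# so Hive can run on Tez instead of MapReduce.
gcloud dataproc clusters create hive-tez-cluster \
    --image-version 1.1 \
    --initialization-actions gs://dataproc-initialization-actions/tez/tez.sh
```

After the cluster comes up, Hive jobs can then select Tez with hive.execution.engine=tez.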

