

Exception in Connecting Dataproc Hive Server using Java and Spark Eclipse

I am trying to access the Hive server on a GCP Dataproc cluster from my local machine (Eclipse) using Java and Spark, but I get the error below when starting the application. I have tried to find the cause but have been unable to solve it.

Exception in thread "main" java.lang.IllegalArgumentException: Unable to instantiate SparkSession with Hive support because Hive classes are not found.

at org.apache.spark.sql.SparkSession$Builder.enableHiveSupport(SparkSession.scala:870)
at com.hadoop.Application.main(Application.java:22)

pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.5.1</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.hadoop</groupId>
    <artifactId>hadoop</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>hadoop</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.google.cloud</groupId>
          <artifactId>libraries-bom</artifactId>
          <version>20.6.0</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
          <groupId>com.google.cloud</groupId>
          <artifactId>google-cloud-dataproc</artifactId>
          <version>1.5.2</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-storage</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.4.7</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.4.7</version>
            <scope>provided</scope>
            <exclusions>
                <exclusion>
                    <groupId>io.netty</groupId>
                    <artifactId>netty-all</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>io.netty</groupId>
            <artifactId>netty-all</artifactId>
            <version>4.1.47.Final</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.10.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.4.7</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>com.sun.jersey</groupId>
            <artifactId>jersey-client</artifactId>
            <version>1.9</version>
        </dependency>   
        <dependency>
            <groupId>org.objenesis</groupId>
            <artifactId>objenesis</artifactId>
            <version>2.5.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

The problem is with the scope of the following dependency:

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.4.7</version>
            <scope>provided</scope>
        </dependency>

According to the Maven documentation:

compile: This is the default scope, used if none is specified. Compile dependencies are available in all classpaths of a project. Furthermore, those dependencies are propagated to dependent projects.

provided: This is much like compile, but indicates you expect the JDK or a container to provide the dependency at runtime. For example, when building a web application for the Java Enterprise Edition, you would set the dependency on the Servlet API and related Java EE APIs to scope provided because the web container provides those classes. A dependency with this scope is added to the classpath used for compilation and test, but not the runtime classpath. It is not transitive.

Because the application is launched locally from Eclipse rather than inside a container that supplies Spark, the provided scope means spark-hive is missing at runtime. You might want to change the scope to compile, or simply remove the <scope> line (compile is the default). Alternatively, download the jar and add it to the classpath yourself.
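
For example, a minimal sketch of the adjusted dependency, with the default compile scope written out explicitly for clarity (omitting the <scope> element entirely has the same effect):

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.4.7</version>
            <!-- compile is the default scope, so the element below could simply be removed -->
            <scope>compile</scope>
        </dependency>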

Also see this doc on how to create a Spark application uber jar which includes its dependencies.
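
For reference, below is a minimal sketch of an uber-jar build section based on the maven-shade-plugin. The plugin version and transformer shown are illustrative assumptions, and since this pom already uses the spring-boot-maven-plugin (which also repackages the jar), you would normally settle on a single packaging approach:

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <transformers>
                            <!-- merge META-INF/services files so Spark's service registrations survive shading -->
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>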
