
java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession

I have written a Spark job in Java. When I submit the job, it fails with the error below:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession
        at com.thinkbiganalytics.veon.util.SparkSessionBuilder.getOrCreateSparkSession(SparkSessionBuilder.java:12)
        at com.thinkbiganalytics.veon.AbstractSparkTransformation.initSparkSession(AbstractSparkTransformation.java:92)
        at com.thinkbiganalytics.veon.transformations.SDPServiceFeeDeductionSourceToEventStore.init(SDPServiceFeeDeductionSourceToEventStore.java:57)
        at com.thinkbiganalytics.veon.AbstractSparkTransformation.doTransform(AbstractSparkTransformation.java:51)
        at com.thinkbiganalytics.veon.transformations.SDPServiceFeeDeductionSourceToEventStore.main(SDPServiceFeeDeductionSourceToEventStore.java:51)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
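
For context, the method at SparkSessionBuilder.java:12 most likely does nothing more than build a session. Below is a minimal sketch of such a builder (the class and method names are taken from the stack trace; the method body and the appName parameter are assumptions):

import org.apache.spark.sql.SparkSession;

public class SparkSessionBuilder {

    // The reference to SparkSession is resolved when this method is first used,
    // so if spark-sql is missing from the runtime classpath the JVM throws
    // NoClassDefFoundError here, before any application logic runs.
    public static SparkSession getOrCreateSparkSession(String appName) {
        return SparkSession.builder()
                .appName(appName)
                .getOrCreate();
    }
}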

If you are running from inside IntelliJ IDEA and you have marked the Spark libraries as "provided", like this: "org.apache.spark" %% "spark-sql" % "3.0.1" % "provided", then you need to edit your Run/Debug configuration and check the "Include dependencies with Provided scope" box.

I was facing this issue while running from the IntelliJ editor. I had marked the Spark jars as provided in pom.xml, see below:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.4.0</version>
    <scope>provided</scope>
</dependency>

On removing the provided scope, the error was gone.

When the Spark jars are marked as provided, they are only available when you run the application with spark-submit or when the Spark jars are already on the classpath.

When submitting with spark-submit, check that the Spark dependency version in your pom.xml is the same as the Spark version installed on the machine.

This may happen because you have two Spark versions on the same machine.
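
As a quick check, you can print the Spark version that spark-submit will actually use and compare it with the <version> element in your pom.xml (2.4.0 in the snippet above):

spark-submit --version

If the reported version differs from the one you compiled against, align the pom.xml dependency or submit with the matching installation.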


If you want to keep different Spark installations on your machine, you can create different soft links and use the exact Spark version against which you built your project:

spark1-submit -> /Users/test/sparks/spark-1.6.2-bin-hadoop2.6/bin/spark-submit

spark2-submit -> /Users/test/sparks/spark-2.1.1-bin-hadoop2.7/bin/spark-submit
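
Assuming the installation paths above, such links could be created along these lines (the /usr/local/bin link directory is just an example; use any directory on your PATH):

ln -s /Users/test/sparks/spark-1.6.2-bin-hadoop2.6/bin/spark-submit /usr/local/bin/spark1-submit
ln -s /Users/test/sparks/spark-2.1.1-bin-hadoop2.7/bin/spark-submit /usr/local/bin/spark2-submit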

Here is a post from the Cloudera community about running multiple Spark versions: https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Multiple-Spark-version-on-the-same-cluster/td-p/39880

You are probably deploying your application on a cluster with a lower Spark version.

Please check the Spark version on your cluster - it should be the same as the version in pom.xml. Please also note that all Spark dependencies should be marked as provided when you use spark-submit to deploy the application.

Based on the exception you are getting, I think the required jar is missing. You need to add the required jar to your classpath, which will resolve the issue.

Refer to this link to download the required jar.
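
If you prefer to keep the jar outside your application bundle, one option is to pass it explicitly at submit time with the --jars option (the jar path and application jar name below are illustrative):

spark-submit \
  --class com.thinkbiganalytics.veon.transformations.SDPServiceFeeDeductionSourceToEventStore \
  --jars /path/to/spark-sql_2.11-2.4.0.jar \
  your-application.jar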

If you are using IntelliJ IDEA, missing jars from the Spark environment will cause this problem. You can follow these steps: File -> Project Structure -> Modules -> spark-examples_2.11 -> Dependencies -> {spark dir}/spark/assembly/target/scala-2.11/jars/

If using Maven, go to your dependencies file (pom.xml) and change the scope from provided to compile.

<dependency>
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-sql_2.13</artifactId>
   <version>3.3.0</version>
   <scope>compile</scope>
</dependency>

I am using a Maven project and got the same issue. I changed the scope from "provided" to "compile" in the pom.xml dependencies and used the same Spark version that was installed with my local Spark shell. The problem got resolved.

If you are running from IntelliJ, please check "Include dependencies with Provided scope" as follows:

Inside the Run/Debug Configuration, select Modify Options and then check "Include dependencies with Provided scope".
