
Why does spark application fail with java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig even though the jar exists?

I am working on a Hadoop cluster that has Spark 2.3.x. For my use case, I need Spark 2.4.x, which I downloaded from the internet, moved to my server, and extracted into a new directory: ~/john/spark247ext/spark-2.4.7-bin-hadoop2.7

This is what my Spark 2.4.7 directory looks like:

username@:[~/john/spark247ext/spark-2.4.7-bin-hadoop2.7] {173} $ ls
bin  conf  data  examples  jars  kubernetes  LICENSE  licenses  NOTICE  python  R  README.md  RELEASE  sbin  yarn

These are the contents of my bin dir.

username@:[~/john/spark247ext/spark-2.4.7-bin-hadoop2.7/bin] {175} $ ls
beeline               find-spark-home.cmd  pyspark2.cmd     spark-class       sparkR2.cmd       spark-shell.cmd  spark-submit
beeline.cmd           load-spark-env.cmd   pyspark.cmd      spark-class2.cmd  sparkR.cmd        spark-sql        spark-submit2.cmd
docker-image-tool.sh  load-spark-env.sh    run-example      spark-class.cmd   spark-shell       spark-sql2.cmd   spark-submit.cmd
find-spark-home       pyspark              run-example.cmd  sparkR            spark-shell2.cmd  spark-sql.cmd

I am submitting my Spark code using the below spark-submit command:

./spark-submit --master yarn --deploy-mode cluster --driver-class-path /home/john/jars/mssql-jdbc-9.2.0.jre8.jar --jars /home/john/jars/spark-bigquery-with-dependencies_2.11-0.19.1.jar,/home/john/jars/mssql-jdbc-9.2.0.jre8.jar --driver-memory 1g --executor-memory 4g --executor-cores 4 --num-executors 4 --class com.loader /home/john/jars/HiveLoader-1.0-SNAPSHOT-jar-with-dependencies.jar somearg1 somearg2 somearg3

The job fails with the exception java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig, so I added that jar to my spark-submit command as well, like below.

./spark-submit --master yarn --deploy-mode cluster --driver-class-path /home/john/jars/mssql-jdbc-9.2.0.jre8.jar --jars /home/john/jars/spark-bigquery-with-dependencies_2.11-0.19.1.jar,/home/john/jars/mssql-jdbc-9.2.0.jre8.jar,/home/john/jars/jersey-client-1.19.4.jar --driver-memory 1g --executor-memory 4g --executor-cores 4 --num-executors 4 --class com.loader /home/john/jars/HiveLoader-1.0-SNAPSHOT-jar-with-dependencies.jar somearg1 somearg2 somearg3

I also checked the directory ~/john/spark247ext/spark-2.4.7-bin-hadoop2.7/jars and found that the jar jersey-client-x.xx.x.jar exists there.

username@:[~/john/spark247ext/spark-2.4.7-bin-hadoop2.7/jars] {179} $ ls -ltr | grep jersey
-rwxrwxrwx 1 john john   951701 Sep  8  2020 jersey-server-2.22.2.jar
-rwxrwxrwx 1 john john    72733 Sep  8  2020 jersey-media-jaxb-2.22.2.jar
-rwxrwxrwx 1 john john   971310 Sep  8  2020 jersey-guava-2.22.2.jar
-rwxrwxrwx 1 john john    66270 Sep  8  2020 jersey-container-servlet-core-2.22.2.jar
-rwxrwxrwx 1 john john    18098 Sep  8  2020 jersey-container-servlet-2.22.2.jar
-rwxrwxrwx 1 john john   698375 Sep  8  2020 jersey-common-2.22.2.jar
-rwxrwxrwx 1 john john   167421 Sep  8  2020 jersey-client-2.22.2.jar
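One thing worth noting: the jars listed above are Jersey 2.x, whose classes live under the org.glassfish.jersey package, while the missing class com.sun.jersey.api.client.config.ClientConfig belongs to Jersey 1.x. So the presence of jersey-client-2.22.2.jar does not necessarily supply that class. Since a jar is just a zip archive, you can verify whether any given jar actually contains a class. The sketch below (stdlib only; it builds a toy jar for demonstration, so the paths and entries are illustrative, not the real Spark jars):

```python
import os
import tempfile
import zipfile


def jar_contains(jar_path, class_name):
    """Return True if the jar archive contains the given fully-qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Demo with a toy jar; in practice, point jar_path at a real file such as
# ~/john/spark247ext/spark-2.4.7-bin-hadoop2.7/jars/jersey-client-2.22.2.jar
with tempfile.TemporaryDirectory() as d:
    toy = os.path.join(d, "toy.jar")
    with zipfile.ZipFile(toy, "w") as jar:
        # A Jersey 2.x-style entry, not the 1.x class the stack trace wants
        jar.writestr("org/glassfish/jersey/client/ClientConfig.class", b"")

    print(jar_contains(toy, "org.glassfish.jersey.client.ClientConfig"))        # True
    print(jar_contains(toy, "com.sun.jersey.api.client.config.ClientConfig"))   # False
```

Running this kind of check against the jars in the Spark distribution tells you whether the class the JVM is complaining about is really on the classpath, rather than a same-named jar from a different major version.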

I also added the dependency in my pom.xml file:

<dependency>
  <groupId>com.sun.jersey</groupId>
  <artifactId>jersey-client</artifactId>
  <version>1.19.4</version>
</dependency>
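To confirm that the Maven build actually packaged the Jersey 1.x class into the fat jar, you can list the jar's entries and grep for the class path. A sketch, using the jar path from the spark-submit command above:

```shell
# List the fat jar's entries and look for the class from the stack trace
jar tf /home/john/jars/HiveLoader-1.0-SNAPSHOT-jar-with-dependencies.jar \
  | grep 'com/sun/jersey/api/client/config/ClientConfig'

# unzip -l works the same way if the JDK's jar tool is not on PATH
```

If the grep prints nothing, the dependency never made it into the assembled jar, which would point at the build rather than the submit command.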

Even after passing the jar file in my spark-submit command and also creating a fat jar out of my Maven project that includes all dependencies, I still see the exception:

Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
        at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:55)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createTimelineClient(YarnClientImpl.java:181)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:168)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:161)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1135)
        at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1530)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

The Spark I downloaded is for my own use case, so I haven't changed any settings of the existing Spark version on the cluster, which is Spark 2.3.

Could anyone let me know what to do to fix the issue so that the code runs properly?

Can you try using this property in your spark-submit:

   --conf "spark.driver.userClassPathFirst=true"

I think you are getting a jar conflict, where a different version of the same jar is being picked up from the environment.
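Applied to the command from the question, that could look like the sketch below. Note that spark.driver.userClassPathFirst is documented as experimental; the matching spark.executor.userClassPathFirst line is an optional addition for the executor side, not something the question requires:

```shell
./spark-submit --master yarn --deploy-mode cluster \
  --conf "spark.driver.userClassPathFirst=true" \
  --conf "spark.executor.userClassPathFirst=true" \
  --driver-class-path /home/john/jars/mssql-jdbc-9.2.0.jre8.jar \
  --jars /home/john/jars/spark-bigquery-with-dependencies_2.11-0.19.1.jar,/home/john/jars/mssql-jdbc-9.2.0.jre8.jar,/home/john/jars/jersey-client-1.19.4.jar \
  --driver-memory 1g --executor-memory 4g --executor-cores 4 --num-executors 4 \
  --class com.loader /home/john/jars/HiveLoader-1.0-SNAPSHOT-jar-with-dependencies.jar \
  somearg1 somearg2 somearg3
```

With these properties set, jars supplied by the user take precedence over same-named classes already on Spark's own classpath, which is what resolves version conflicts like this one.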
