
NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext


I am trying to use Oozie to call Spark jobs. The Spark job runs successfully without Oozie when submitted with spark-submit:

spark-submit --class xxx --master yarn-cluster --files xxx/hive-site.xml --jars xxx/datanucleus-api-jdo-3.2.6.jar,xxx/datanucleus-rdbms-3.2.9.jar,xxx/datanucleus-core-3.2.10.jar xxx.jar

But when I try to use Oozie to call the job, it always fails with the error below. I have included the three external jars and hive-site.xml in the workflow.xml (a sketch of the Spark action follows the stack trace).

Launcher exception: org/apache/spark/sql/hive/HiveContext
java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
    at xxx$.main(xxx.scala:20)
    at xxx.main(xxx.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
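
For reference, a Spark action in workflow.xml would typically declare these dependencies roughly as follows; the action name, schema version, the ${jobTracker}/${nameNode} properties and the xxx placeholders are illustrative rather than my exact definition:

<action name="spark-job">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>xxx</name>
        <class>xxx</class>
        <jar>xxx.jar</jar>
        <!-- same --files / --jars as in the working spark-submit command above -->
        <spark-opts>--files xxx/hive-site.xml --jars xxx/datanucleus-api-jdo-3.2.6.jar,xxx/datanucleus-rdbms-3.2.9.jar,xxx/datanucleus-core-3.2.10.jar</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>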

The 20th line of my Scala code is:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

Does anyone have any idea about this error? I have been stuck for several days.

Thank you!

Just came back to answer my own question. This ended up being solved by updating the Oozie shared lib. Basically, the jars in the shared lib were not complete enough for my job to run, so I first added some additional jars such as spark-hive and spark-mllib. The jars provided in the Oozie shared lib were also too old and needed to be updated to avoid some potential errors.
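
In case it helps someone else, the sharelib update looked roughly like this; the Oozie URL, the lib_<timestamp> directory, and the Scala/Spark versions in the jar names are placeholders that depend on your cluster and Spark build:

# List what is currently in the spark sharelib (replace the Oozie URL with yours)
oozie admin -oozie http://oozie-host:11000/oozie -shareliblist spark

# Upload the missing / newer jars into the active spark sharelib directory on HDFS
# (assumes the default sharelib location; lib_<timestamp> and jar versions are placeholders)
hdfs dfs -put spark-hive_2.10-<version>.jar /user/oozie/share/lib/lib_<timestamp>/spark/
hdfs dfs -put spark-mllib_2.10-<version>.jar /user/oozie/share/lib/lib_<timestamp>/spark/

# Tell Oozie to pick up the refreshed sharelib without a restart
oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate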
