
java.lang.NoClassDefFoundError spark-submit using class from other project

I wrote some simple Spark code in Java. I compile it with Maven in Eclipse and then start it with spark-submit. Everything works fine.

But now I am trying to use a class from another project in Eclipse that is not a Maven project: OpenRefine (formerly Google Refine). I also want to use a json.jar that I added to the build path in Eclipse.

So I imported it like this:

import org.json.simple.parser.JSONParser;
import com.google.refine.operations.OnError; //from other project
import com.google.refine.operations.cell.TextTransformOperation; //from other project

Eclipse doesn't flag these imports as errors, and compiling with Maven gives me "BUILD SUCCESS".

But when running it I get this error:

Exception in thread "main" java.lang.NoClassDefFoundError: com/google/refine/operations/OnError
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
    at java.lang.Class.getMethod0(Class.java:2774)
    at java.lang.Class.getMethod(Class.java:1663)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:325)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.refine.operations.OnError
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.sec

When I delete the code that uses com.google.refine.operations.OnError, I get the same error for JSONParser.

Can anyone help me? I don't know what to do

Edit: The json.jar now works after I added this parameter to the spark-submit call:

--jars /path/to/json-simple-1.1.jar

The other classes are not in .jar files. I wonder whether it is even possible to add them to the runtime classpath, or whether I have to build my own .jar files, which would be very tricky because OpenRefine is a big project and I have no idea how to get a jar out of it.

This means that the jar or Eclipse project containing com.google.refine.operations.OnError is on your compile-time classpath but not on your runtime classpath.
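A quick way to see the compile-time versus runtime difference is to ask the JVM at runtime whether it can load the classes. The following is only a minimal sketch: the class name ClasspathCheck is made up for illustration, and the two class names it checks are the ones from the question.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here too
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the question
        String[] names = {
            "org.json.simple.parser.JSONParser",
            "com.google.refine.operations.OnError"
        };
        for (String name : names) {
            System.out.println(name + " -> "
                    + (isLoadable(name) ? "found" : "MISSING at runtime"));
        }
        // Shows what the JVM was actually started with
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If this is packaged into the application jar and used as the main class for spark-submit, any class reported as missing is one that Eclipse and Maven could see at compile time but that was never shipped with the job.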

One solution is to copy all of the OpenRefine Java source code (OpenRefine/main/src) into your Maven project's src directory and build a jar file that includes OpenRefine.

Although the Spark documentation suggests using maven-shade-plugin to generate a jar file that includes all dependencies, that doesn't help in your case because the OpenRefine project doesn't use Maven.

Once the jar file is created, you can confirm that it contains the OpenRefine classes before submitting the Spark job:

$ jar tf "<the jar file you created>"
...
com/google/refine/operations/OnError.class
...
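The same check can be done from Java, for example as part of a build or pre-submit step. This is a rough sketch using the standard java.util.jar API; the class name JarContains and its methods are hypothetical helpers, not part of Spark or OpenRefine.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarContains {

    /** Converts a fully qualified class name into its entry path inside a jar. */
    public static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for the given class. */
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarContains <jar file> <class name>");
            System.exit(2);
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " is in " : " is NOT in ") + args[0]);
        // Non-zero exit code so a script can stop before calling spark-submit
        System.exit(found ? 0 : 1);
    }
}
```

It would be invoked with the jar you created as the first argument and com.google.refine.operations.OnError as the second.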

Once the jar includes the classes, call spark-submit with that jar file. The Spark driver and executors can then find the classes on the runtime classpath.

Note that OpenRefine is released under the following license, which does not prohibit including the source code in your project as long as you follow its terms: https://github.com/OpenRefine/OpenRefine/blob/master/LICENSE.txt
