I have several Spark big data applications written in Scala. These applications also have equivalent versions written in R.
I also have a web server application written in Java, which serves as the API for the web GUI. The goal is to let the GUI execute these applications and choose which version to run: R or Spark. I managed to call the R code from the Java API and return the result as JSON, but executing the Spark programs is turning out to be much more complicated.
So far, I have been able to merge one of the Scala .jar files with the Java API using Maven. I did this by declaring my Spark program as a local repository in pom.xml, so that the Scala code is included in the final .jar package. I also listed Scala and the breeze library as dependencies in pom.xml. When I send a request to the API, it of course throws java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession$. At first I assumed this was because I hadn't listed the Spark library among the Maven dependencies, but now I suspect my whole approach is wrong, since Spark applications are normally run by executing the spark-submit command in a terminal.
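One note on the NoClassDefFoundError: when a jar is intended to be launched through spark-submit, the Spark dependencies are usually declared with provided scope, so the code compiles against Spark but the classes are supplied by the Spark runtime rather than bundled into the jar. A sketch of such a dependency entry, assuming Spark 2.x built for Scala 2.11 (adjust the artifact and version to match your cluster):

```xml
<!-- illustrative coordinates; match your Spark/Scala versions -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.4.8</version>
  <scope>provided</scope>
</dependency>
```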
What I'm now considering is putting the Java API .jar and the Scala .jar in one folder, and then executing spark-submit from inside the Java API .jar, targeting the Scala .jar. Is this even correct? And how do I execute spark-submit from Java code? Does it have to be done with Runtime.exec(), as mentioned here?
SparkLauncher can be used to submit Spark code (written in Scala, with the precompiled jar scala.jar placed at a known location) from the Java API code.
The Spark documentation recommends the following way to submit a Spark job programmatically from inside a Java application. Add the code below to your Java API code:
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
    public static void main(String[] args) throws Exception {
        SparkAppHandle handle = new SparkLauncher()
                .setAppResource("/my/scala.jar")
                .setMainClass("my.spark.app.Main")
                .setMaster("local")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .startApplication();
        // Use the handle API to monitor / control the application.
    }
}
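If adding the spark-launcher dependency to the Java API is not an option, spark-submit can also be invoked as an external process via ProcessBuilder (the modern replacement for Runtime.exec()). A minimal sketch, assuming spark-submit is on the PATH; the jar path and main class are placeholders matching the example above:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class SparkSubmitRunner {

    // Assemble the spark-submit command line; paths and class name are placeholders.
    public static List<String> buildCommand(String jarPath, String mainClass) {
        return Arrays.asList(
                "spark-submit",
                "--class", mainClass,
                "--master", "local",
                "--driver-memory", "2g",
                jarPath);
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = buildCommand("/my/scala.jar", "my.spark.app.Main");
        try {
            Process p = new ProcessBuilder(cmd)
                    .inheritIO()   // forward Spark's stdout/stderr to this process
                    .start();
            int exitCode = p.waitFor();  // block until the Spark job finishes
            System.out.println("spark-submit exited with " + exitCode);
        } catch (IOException e) {
            // spark-submit is not on the PATH in this environment
            System.out.println("could not start spark-submit: " + e.getMessage());
        }
    }
}
```

Unlike SparkLauncher's startApplication(), this gives you only the process exit code, not a handle for monitoring the job's state, which is why the documentation's SparkLauncher approach above is generally preferable.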