I'm trying to submit an application to my Spark cluster (standalone mode) through the spark-submit command. I'm following the official Spark documentation, as well as relying on this other one. Now the problem is that I get strange behaviors. My setup is the following:

- all my application jars are located in /home/myuser/jars on the master node
- the jar containing the entry point is called dat-test.jar
- the main class is at the package path my.package.path.Test
- the Spark master URL is spark://master:7077
Now, I submit the application directly on the master node, thus using the client deploy mode, running the command

./spark-submit --class my.package.path.Test --master spark://master:7077 --executor-memory 5G --total-executor-cores 10 /home/myuser/jars/*

and I receive an error:

java.lang.ClassNotFoundException: my.package.path.Test
If I activate the verbose mode, what I see is that the primaryResource selected as the jar containing the entry point is the first jar in alphabetical order in /home/myuser/jars/ (which is not dat-test.jar), leading (I suppose) to the ClassNotFoundException. All the jars in the same directory are loaded anyway, but as arguments.
Of course, if I run

./spark-submit --class my.package.path.Test --master spark://master:7077 --executor-memory 5G --total-executor-cores 10 /home/myuser/jars/dat-test.jar

it finds the Test class, but it doesn't find the classes contained in the other jars. Finally, if I use the --jars flag and run

./spark-submit --class my.package.path.Test --master spark://master:7077 --executor-memory 5G --total-executor-cores 10 --jars /home/myuser/jars/* /home/myuser/jars/dat-test.jar

I obtain the same result as with the first option: the first jar in /home/myuser/jars/ is loaded as the primaryResource, leading to a ClassNotFoundException for my.package.path.Test. The same happens if I use --jars /home/myuser/jars/*.jar.
Important points are:

- the jars in /home/myuser/jars/ are many, and I'd like to know if there's a way to include them all instead of writing them out in the comma-separated syntax
- if I submit with --deploy-mode cluster on the master node, I don't get the error, but the computation fails for some other reasons (but that is another problem)

Which is then the correct way of running a spark-submit in client mode? Thanks
There is no way to include all the jars with a glob in the --jars option: spark-submit expects a comma-separated list there, and the shell expands /home/myuser/jars/* into space-separated paths before spark-submit ever sees them, which is why the first jar in alphabetical order ends up as the primary resource. You will have to create a small script to enumerate them. This part is a bit sub-optimal.
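That "small script" can be as simple as letting the shell expand the glob and joining the resulting paths with commas. A minimal POSIX-sh sketch (the directory, class name, and master URL are taken from the question; the join_jars helper name is made up here):

```shell
# Join every .jar in a directory into the comma-separated list --jars expects.
# Note: this simple tr-based join assumes the paths contain no spaces.
join_jars() {
  echo "$1"/*.jar | tr ' ' ','
}

# Usage, with the layout from the question:
#   ./spark-submit --class my.package.path.Test --master spark://master:7077 \
#     --executor-memory 5G --total-executor-cores 10 \
#     --jars "$(join_jars /home/myuser/jars)" \
#     /home/myuser/jars/dat-test.jar
```

The primary resource (dat-test.jar) is still passed explicitly as the last argument, so the entry point stays unambiguous no matter how the other jars sort alphabetically.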