
How to submit a Scala job to Spark?

I have a Python script that I was able to submit to Spark in the following way:

/opt/spark/bin/spark-submit --master yarn-client test.py

Now, I try to submit a Scala program in the same way:

/opt/spark/bin/spark-submit --master yarn-client test.scala

As a result I get the following error message:

Error: Cannot load main class from JAR file:/home/myname/spark/test.scala
Run with --help for usage help or --verbose for debug output

The Scala program itself is just a Hello World program:

object HelloWorld {
    def main(args: Array[String]): Unit = {
        println("Hello, world!")
    }
}

What am I doing wrong?

For starters, you'll have to create a JAR file. You cannot simply submit Scala source directly. If in doubt, see Getting Started with sbt.
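For reference, a minimal build.sbt for this example might look like the one below. The project name and the Scala version here are assumptions; use whatever matches your Spark installation:

// build.sbt -- a minimal sketch; name and scalaVersion are placeholders
name := "hello-world"

version := "0.1"

scalaVersion := "2.11.12"

Running sbt package from the project root then produces a JAR under target/scala-2.11/ (for example target/scala-2.11/hello-world_2.11-0.1.jar), which is the path you pass to spark-submit.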

After that, just add a --class parameter pointing to HelloWorld. Assuming no packages:

/opt/spark/bin/spark-submit --master yarn-client --class "HelloWorld" path_to.jar
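In yarn-client mode the driver runs in the submitting process, so the "Hello, world!" output from println should appear directly in your terminal.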

It depends on the cluster mode you are using.

Have a look at the generic command:

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]

For yarn-client (note that all options must come before the application JAR):

/opt/spark/bin/spark-submit \
  --class "HelloWorld" \
  --master yarn-client \
  your_jar_with_scala_file
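As a side note: since Spark 2.0 the yarn-client master string is deprecated in favor of specifying the resource manager and the deploy mode separately. Assuming a YARN cluster, the equivalent invocation would be:

/opt/spark/bin/spark-submit \
  --class "HelloWorld" \
  --master yarn \
  --deploy-mode client \
  your_jar_with_scala_file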

Have a look at the Spark documentation for a better understanding.
