
Submit Application in Apache Spark

I am a newbie to Apache Spark and am trying to create a simple application to run in local mode. I realized that Spark has scripts like spark-submit to submit an application.

I am looking for something similar to Apache Storm's LocalCluster.submitTopology() to submit the application programmatically. Please point me to the equivalent API in Spark. Any help is appreciated. Thanks.

I believe you can do this in your main:

SparkSession sparkSession = SparkSession
    .builder()
    .master("local[2]")
    .appName("appName")
    .getOrCreate();

This is for Spark 2.0.

In Spark 1.6 you'd use:

SparkConf sparkConf = new SparkConf().setAppName("appName").setMaster("local[2]");
SparkContext sc = new SparkContext(sparkConf);

So you can run a Spark application either in cluster mode or in local mode. In the cluster case, you can go for a YARN or Mesos cluster, or a Spark standalone cluster.

If you want to submit your application to YARN or Mesos, you have to package your Spark app into a fat jar and then submit it from a console using spark-submit.

If you want to run a Spark app on a cluster programmatically, you have to set up a Spark standalone cluster and provide the master node's address in the setMaster() property. The app will then run on the cluster.

    SparkConf sparkConf = new SparkConf().setAppName("appName").setMaster("spark://sparkmasterip:7077");
    SparkContext sc = new SparkContext(sparkConf);

If you want to run a Spark app in local mode programmatically, you have to add the Spark libraries to your project and specify the number of threads to use in the setMaster() property. The app will then run in local mode.

SparkConf sparkConf = new SparkConf().setAppName("appName").setMaster("local[8]");
SparkContext sc = new SparkContext(sparkConf);

You can use SparkLauncher. In the package summary the library is described as follows:

This library allows applications to launch Spark programmatically. There's only one entry point to the library - the SparkLauncher class.

With it you can launch a Spark application like this:

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
      .setAppResource("/my/app.jar")
      .setMainClass("my.spark.app.Main")
      .setMaster("local")
      .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
      .startApplication();
    // Use handle API to monitor / control application.
  }
}

This gives you a SparkAppHandle to control your Spark application. It is also possible to launch a raw process, but it is recommended to use the way shown above.
