How can I run a Spark job programmatically?

I want to run a Spark job programmatically, submitting the SparkPi calculation to a remote cluster directly from IDEA (my laptop):

import org.apache.spark.{SparkConf, SparkContext}

import scala.math.random

object SparkPi {

  def main(args: Array[String]): Unit = {
    // Point the driver at the remote standalone master
    val conf = new SparkConf().setAppName("Spark Pi")
      .setMaster("spark://host-name:7077")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = 100000 * slices
    // Monte Carlo estimate: sample points in the unit square,
    // count those that land inside the unit circle
    val count = spark.parallelize(1 to n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }

}

However, when I run it, I observe the following error:

14/12/08 11:31:20 ERROR security.UserGroupInformation: PriviledgedActionException as:remeniuk (auth:SIMPLE) cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1421)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.security.PrivilegedActionException: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    ... 4 more

When I run the same script with spark-submit from my laptop, I see the same error.

Only when I upload the jar to the remote cluster (the machine where the master is running) does the job complete successfully:

./bin/spark-submit --master spark://host-name:7077 --class com.viaden.crm.spark.experiments.SparkPi ../spark-experiments_2.10-0.1-SNAPSHOT.jar

According to the exception stack, this is most likely a firewall issue on your side: the executors on the cluster cannot connect back to the driver running on your laptop, so executor registration times out after 30 seconds.
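
If that is the cause, one workaround is to pin the driver's endpoints to fixed ports and open exactly those ports in your laptop's firewall. A minimal sketch, assuming a Spark 1.x standalone setup; the hostname and port numbers are placeholders you would adapt:

import org.apache.spark.SparkConf

// Sketch only: fix the driver's network endpoints so firewall rules can target them.
// "laptop-hostname" and the port values are placeholders, not taken from the question.
val conf = new SparkConf()
  .setAppName("Spark Pi")
  .setMaster("spark://host-name:7077")
  .set("spark.driver.host", "laptop-hostname")  // address the executors must reach
  .set("spark.driver.port", "51000")            // fixed driver RPC port
  .set("spark.fileserver.port", "51001")        // fixed file server port (Spark 1.x)
  .set("spark.broadcast.port", "51002")         // fixed broadcast server port (Spark 1.x)
  .set("spark.blockManager.port", "51003")      // fixed block manager port

With this in place, allowing inbound TCP on ports 51000-51003 should let the executors register with the driver.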

Please refer to this similar case: Intermittent Timeout Exception using Spark.
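
Alternatively, if opening inbound ports on your laptop is not an option, you can have the standalone master launch the driver on the cluster itself via cluster deploy mode. Note that the jar path must then be visible from the worker nodes; the path below is a placeholder:

./bin/spark-submit --master spark://host-name:7077 --deploy-mode cluster --class com.viaden.crm.spark.experiments.SparkPi /path/visible/to/workers/spark-experiments_2.10-0.1-SNAPSHOT.jar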
