
How to get a Hadoop/Spark job's tracking URL, or capture spark-submit output, from Scala code

I deployed a Hadoop-Spark cluster and can successfully submit Spark jobs through the bin/spark-submit script. Now I want to capture the tracking URL, such as http://hadoop-01:8088/proxy/application_1446625315279_0017/, and use it in another Scala project of mine. How can I do that? I tried redirecting the spark-submit output, but it doesn't seem to work:

./bin/spark-submit --class org.apache.spark.examples.mllib.JavaKMeans --master yarn-cluster --num-executors 32 --executor-cores 4 --executor-memory 16G --driver-memory 8G lib/spark-examples-1.4.0-hadoop2.6.0.jar /data/kmeans_data.txt 100 9 >> log.log

But after the job finishes, log.log is still empty.

I also tried using scala.sys.process.ProcessIO to capture the spark-submit output, but it doesn't work as I expected. Here is the code:

import scala.sys.process._

def submitSparkJob(filename: String) = {
  // Base spark-submit command; the newlines are collapsed to spaces below.
  val baseCmd = """/opt/spark-1.4.0-bin-hadoop2.6/bin/spark-submit
              | --master yarn-cluster
              | --num-executors 32 --executor-cores 4 --executor-memory 16G
              | --driver-memory 8G""".stripMargin.replace("\n", " ")

  val jarEntry = " --class org.apache.spark.examples.mllib.JavaKMeans "
  val jarFile = " /opt/spark-1.4.0-bin-hadoop2.6/lib/spark-examples-1.4.0-hadoop2.6.0.jar"
  val params = " /data/" + filename + " 1000 9"
  val cmd = baseCmd + jarEntry + jarFile + params

  val pb = Process(cmd)
  // Only stdout is read and printed here; stdin and stderr are ignored.
  val pio = new ProcessIO(_ => (),
                    stdout => scala.io.Source.fromInputStream(stdout)
                      .getLines.foreach(println),
                    _ => ())
  pb.run(pio)
}

Nothing is printed in the terminal. How can I get the Spark job's "tracking URL" and use it in my Scala code? Thank you!

It turns out that what bin/spark-submit displays in the terminal is written to the error stream (stderr), not stdout. So either redirect it with ./bin/spark-submit args 2>>log.txt, or change the Scala code to read the error stream:

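// `pattern` (a Regex) and `result` (a var) are assumed to be defined elsewhere.
val pio = new ProcessIO(_ => (),
                    _ => (),
                    // Read stderr line by line and keep the lines that match.
                    err => scala.io.Source.fromInputStream(err)
                      .getLines.foreach( line =>
                        if (pattern.findFirstMatchIn(line).isDefined) {
                          result = line
                          println(line)
                        }
                      )
                    )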
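
For reference, here is a minimal self-contained sketch of the same stderr approach using ProcessLogger instead of ProcessIO. The names TrackingUrl, findTrackingUrl and runAndFindTrackingUrl are hypothetical and not from the original post. The regular expression is also an assumption: it expects the line of interest to contain the text "tracking URL:" followed by the URL, so adjust it to whatever your spark-submit actually prints.

```scala
import scala.collection.mutable.ListBuffer
import scala.sys.process._

object TrackingUrl {
  // Assumed line shape: "... tracking URL: http://host:port/proxy/application_.../"
  // `unanchored` lets the pattern match anywhere inside a line.
  private val TrackingUrlPattern = """tracking URL:\s*(\S+)""".r.unanchored

  /** Returns the first URL that follows "tracking URL:" in the given lines, if any. */
  def findTrackingUrl(lines: Seq[String]): Option[String] =
    lines.collectFirst { case TrackingUrlPattern(url) => url }

  /** Runs the command, collects its stderr, and returns (exit code, tracking URL if found). */
  def runAndFindTrackingUrl(command: Seq[String]): (Int, Option[String]) = {
    val stderrLines = ListBuffer.empty[String]
    // First handler receives stdout lines (ignored), second receives stderr lines.
    val logger = ProcessLogger(_ => (), line => { stderrLines += line; () })
    // `!` blocks until the process exits and returns its exit code.
    val exitCode = Process(command).!(logger)
    (exitCode, findTrackingUrl(stderrLines.toList))
  }
}
```

To use it, call TrackingUrl.runAndFindTrackingUrl with the spark-submit path and its arguments as a Seq[String]. Passing the arguments as a sequence avoids splitting a single command string on whitespace. Note that this version only returns once the spark-submit process has exited; to pick up the URL while the process is still running, react to each line inside the stderr handler instead of collecting the lines first.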


 