
How to get the application ID / job ID of a job submitted to a Spark cluster using the spark-submit command?

I am submitting an Apache Spark job using the spark-submit command. I want to retrieve the application ID or job ID of the job submitted with spark-submit. What is the recommended way to do this?

The output of the spark-submit command can be parsed to get the application ID. This is the line you should be looking for:

2018-09-08 12:01:22 INFO StandaloneSchedulerBackend:54 - Connected to Spark cluster with app ID app-20180908120122-0001

appId=`./bin/spark-submit <options> 2>&1 | tee /dev/tty | grep -i "Connected to Spark cluster" | grep -o "app-[0-9]*-[0-9]*"`
echo $appId
app-20180908120122-0001

Your use case is not clear, but if you are looking for the application ID after the job has completed, this could be helpful. Note that this log line may differ for YARN and other cluster managers.
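As a rough sketch of how the same approach can be adapted for YARN: when submitting to YARN, the client log typically contains a "Submitted application application_..." line (the exact prefix can vary by Spark/YARN version, so treat the sample line below as illustrative, not captured from a real run):

```shell
# Illustrative YARN client log line; in practice this text would come from:
#   ./bin/spark-submit --master yarn <options> 2>&1
sample_log='2018-09-08 12:01:22 INFO Client: Submitted application application_1536393465083_0001'

# YARN application IDs have the form application_<clusterTimestamp>_<sequence>,
# so a grep -o with that pattern pulls the ID out of the log stream.
appId=$(echo "$sample_log" | grep -o 'application_[0-9]*_[0-9]*')
echo "$appId"
```

The same `tee /dev/tty` trick from the standalone example above can be combined with this grep so the full log still reaches your terminal while the ID is captured into a variable.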

Since it's not clear whether you want it programmatically inside the application, I'll assume you do. You can get the YARN application ID (or the job ID in local mode) as follows:

import org.apache.spark.sql.SparkSession

val sparkSession: SparkSession = ???  // your existing session
val appID: String = sparkSession.sparkContext.applicationId

Hope this answers your question.

You can also look up a running streaming query by its UUID or query name.

Like this: sparkSession.streams.get(uuid) (where uuid is the query's id, as returned by StreamingQuery.id — note that get matches the query id, not the run id).

