
spark-submit not waiting for state FINISHED before exiting program

I am submitting a Spark job to our YARN service via spark-submit. From my understanding, spark-submit should keep running until the application reaches the FINISHED state before moving on. However, once submitted through Bamboo, spark-submit exits and goes straight to the sleep, after which the SQL query runs. But the SQL query shouldn't run until the Spark job is 100% finished. I'm not sure why my spark-submit is not waiting. Any help is appreciated, thanks.

nohup spark-submit --name "${APP_NAME}" \
                    --class "${SPARK_CLASS_NAME}" \
                    --files jaas.conf,kafka.properties,distributed.properties,${KEYTAB},pools.xml \
                    --principal ${PRINCIPAL} \
                    --keytab ${KEYTAB_ALT} \
                    --conf "spark.driver.extraJavaOptions=${JVM_ARGS}" \
                    --conf "spark.executor.extraJavaOptions=${JVM_ARGS}" \
                    --conf spark.haplogic.env=${ENV} \
                    --conf spark.scheduler.allocation.file=${POOL_SCHEDULER_FILE} \
                    --conf spark.master=yarn \
                    --conf spark.submit.deployMode=cluster \
                    --conf spark.yarn.submit.waitAppCompletion=true \
                    --conf spark.driver.memory=$(getProperty "spark.driver.memory") \
                    --conf spark.executor.memory=$(getProperty "spark.executor.memory") \
                    --conf spark.executor.instances=$(getProperty "spark.executor.instances") \
                    --conf spark.executor.cores=$(getProperty "spark.executor.cores") \
                    --conf spark.yarn.maxAppAttempts=$(getProperty "spark.yarn.maxAppAttempts") \
                    --conf spark.dynamicAllocation.enabled=$(getProperty "spark.dynamicAllocation.enabled") \
                    --conf spark.yarn.queue=$(getProperty "spark.yarn.queue") \
                    --conf spark.memory.fraction=$(getProperty "spark.memory.fraction") \
                    --conf spark.memory.storageFraction=$(getProperty "spark.memory.storageFraction") \
                    --conf spark.eventLog.enabled=$(getProperty "spark.eventLog.enabled") \
                    --conf spark.serializer=org.apache.spark.serializer.JavaSerializer \
                    --conf spark.acls.enable=true \
                    --conf spark.admin.acls.groups=${USER_GROUPS} \
                    --conf spark.acls.enable.groups=${USER_GROUPS} \
                    --conf spark.ui.view.acls.groups=${USER_GROUPS} \
                    --conf spark.yarn.appMasterEnv.SECRETS_LIB_MASTER_KEY=${SECRETS_LIB_MASTER_KEY} \
                    ${JARFILE_NAME} >> ${LOG_FILE} 2>&1 &
sleep 90

The issue was a gap in my bash knowledge: the trailing `&` runs spark-submit in the background, so the script continues immediately and Bamboo treats the step as complete once the `sleep` finishes. Running spark-submit in the foreground (dropping `nohup` and the `&`), or explicitly waiting on the background process, makes the shell block until the application actually reaches FINISHED (since `spark.yarn.submit.waitAppCompletion=true` is set).
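The race can be sketched without Spark at all. Below, `slow_job` is a hypothetical stand-in for the spark-submit command: backgrounding it with `&` lets the script run ahead, and `wait` on its PID (or simply dropping the `&`) restores the blocking behavior.

```shell
#!/bin/sh
# Sketch of the bug and the fix, using a stand-in for spark-submit.

# Hypothetical long-running job; in the real script this is spark-submit.
slow_job() { sleep 1; echo "FINISHED"; }

# Buggy pattern: '&' backgrounds the job, so the next line runs immediately,
# before the job is done -- this is why the SQL query started too early.
slow_job > /tmp/job.log 2>&1 &
job_pid=$!

# Fix: block on the background PID before moving on
# (equivalently, omit '&' and run the job in the foreground).
wait "$job_pid"
status=$?

echo "job exit status: $status"
cat /tmp/job.log
```

With `wait` in place, the exit status of the job is also available, so the script can fail the Bamboo step instead of silently running the SQL query after a failed Spark job.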
