How to kill spark applications submitted using spark-submit reliably

I'm seeing a strange problem. I have a Spark cluster in standalone mode, and I submit Spark jobs from a remote node from the terminal as follows:

$> spark-submit --master spark://10.1.40.18:7077  --class com.test.Ping spark-jobs.jar

When the app is running and I press Ctrl-C in the console terminal, the process is killed and so is the app in the Spark master UI. When I go to the Spark master UI, I see that this app is in state Killed under Completed Applications, which is what I expected to see.

Now, I created a shell script as follows to do the same:

#!/bin/bash
spark-submit --master spark://10.1.40.18:7077  --class com.test.Ping spark-jobs.jar &
echo $! > my.pid

When I execute the shell script from the terminal as follows:

$> bash myscript.sh

The application is submitted correctly to the Spark master and I can see it as one of the running apps in the Spark master UI. But when I kill the process in my terminal as follows:

$> kill $(cat my.pid)

I see that the process is killed on my machine, but the Spark application is still running on the Spark master! It doesn't get killed.

I noticed one more thing: when I launch the Spark job via the shell script and kill the application from the Spark master UI by clicking "kill" next to the running application, it gets killed in the Spark UI, but I still see the process running on my machine.

In both cases, I would expect both the remote Spark app and my local process to be killed.

Why is this happening? And how can I kill a Spark app from the terminal when it was launched via a shell script, without going to the Spark master UI?

I want to launch the Spark app via a script and log the PID so I can monitor it remotely.

Thanks for the help.

I solved the first issue by adding a shutdown hook in my code. The shutdown hook gets called when you exit your script (Ctrl-C, kill ... but not kill -9).

val shutdownHook = scala.sys.addShutdownHook {
  try {
    sparkContext.stop()
    // Make sure to kill any other threads or thread pool you may be running
  } catch {
    case e: Exception =>
      // ...
  }
}

As for the other issue, killing from the UI: I also had that issue. It was caused by a thread pool that I use.

So I surrounded my code with a try/finally block to guarantee that the thread pool was shut down when Spark stopped, along the lines of the sketch below.
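
For illustration only (this is not the answerer's original code), here is a minimal sketch of that pattern, assuming a driver class like the com.test.Ping from the question; the thread pool, its size, and the work submitted to it are hypothetical placeholders:

import org.apache.spark.{SparkConf, SparkContext}
import java.util.concurrent.{Executors, TimeUnit}

object Ping {
  def main(args: Array[String]): Unit = {
    val sparkContext = new SparkContext(new SparkConf().setAppName("Ping"))
    // Hypothetical worker pool; its non-daemon threads are what can keep the
    // local JVM alive after the app is killed from the Spark master UI.
    val threadPool = Executors.newFixedThreadPool(4)
    try {
      // ... submit tasks to threadPool and run the Spark work here ...
    } finally {
      // Runs whether the job finishes, fails, or Spark is stopped, so the
      // driver process can actually exit.
      threadPool.shutdownNow()
      threadPool.awaitTermination(30, TimeUnit.SECONDS)
      sparkContext.stop()
    }
  }
}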

I hope this helps.

The way to kill a running application with Spark is:

spark-submit --kill [submission ID] --master [spark://...]
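
This applies to drivers submitted in cluster mode to a standalone (or Mesos) master; the submission ID (something like driver-20240101123456-0000) is printed by spark-submit when the driver is accepted and is also visible in the master UI. A hedged end-to-end sketch, reusing the master address from the question (the submission ID shown is made up, and depending on the Spark version the master's REST port 6066 may be needed instead of 7077):

$> spark-submit --master spark://10.1.40.18:7077 --deploy-mode cluster --class com.test.Ping spark-jobs.jar
# note the submission ID printed above, e.g. driver-20240101123456-0000
$> spark-submit --master spark://10.1.40.18:7077 --kill driver-20240101123456-0000
$> spark-submit --master spark://10.1.40.18:7077 --status driver-20240101123456-0000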

This is the approach I follow when I want to kill a specific Spark job that is running in cluster mode and then start it again with a new version of the application, so handling this programmatically is the best way to do it.

I use two shell scripts, one to stop and one to start.

start.sh is just the familiar spark-submit script.

The other one is stop.sh, and its code is below.

I invoke stop.sh first and then start.sh (via CI/CD); how you automate the end-to-end deployment is up to you. A minimal invocation is sketched after the script below.

This piece of code kills a particular job (if it is running) given its name, as set via the spark-submit "name" parameter. (It uses the yarn CLI, so it assumes the job is running on YARN.)

Also, to be on the safer side, I trim any leading or trailing whitespace from the application ID. Execute these scripts on the same cluster where the Spark jobs are running.

#!/bin/bash
echo "stop.sh invoked"
export APPLICATION_NAME=my-spark-job-name
echo "Killing $APPLICATION_NAME"
# Find the YARN application ID (column 1) whose name (column 2) matches APPLICATION_NAME
APPLICATION_ID=$(yarn application --appStates RUNNING --list 2>/dev/null | awk "{ if (\$2 == \"$APPLICATION_NAME\") print \$1 }")
if [ "$APPLICATION_ID" = "" ]
then
        echo "$APPLICATION_NAME not running."
else
        # Trim leading and trailing whitespace from the application ID
        APPLICATION_ID="${APPLICATION_ID#"${APPLICATION_ID%%[![:space:]]*}"}"
        APPLICATION_ID="${APPLICATION_ID%"${APPLICATION_ID##*[![:space:]]}"}"
        yarn application --kill "$APPLICATION_ID" 2>/dev/null
        echo "$APPLICATION_NAME successfully killed!"
fi
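
With both scripts in place, the CI/CD redeploy step can simply chain them (script names as used above; adjust the paths to wherever your pipeline checks them out):

$> bash stop.sh && bash start.sh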
