简体   繁体   English

提交“SLURM”作业时出现“nohup”问题

[英]`nohup` issue with submitting `SLURM` job

I have a python code main.py that runs bash script, the bash script inturn submits a job job.bash and obtains its JOBID using echo $JOBID | awk {'print $4'}我有一个运行 bash 脚本的 python 代码main.py ,bash 脚本又提交一个作业job.bash并使用echo $JOBID | awk {'print $4'}获取它的JOBID echo $JOBID | awk {'print $4'} . echo $JOBID | awk {'print $4'} If I run python in the terminal, the bash script works and I am able to obtain and echo the JOBID as follows:如果我在终端中运行 python,则 bash 脚本可以工作,并且我能够获取并回显JOBID ,如下所示:

#!/bin/bash 
JOBID=`sbatch ~/job.bash  | tee  output.log`
JOBID=`echo $JOBID | awk {'print $4'}`
echo $JOBID

Running above as part of python works in terminal python main.py , but doing nohup python main.py & , the echo does not print or store JOBID .作为 python 的一部分在上面运行在终端python main.py中运行,但是在执行nohup python main.py &时,回显不打印或存储JOBID

Any reason for this?这有什么原因吗?


I am submitting a slurm job hence the JOBID is the pid from slurm我正在提交一个 slurm 作业,因此JOBID是来自 slurm 的 pid


(Update Jul 17) Looks like the issue is with the command sbatch ~/job.bash | tee output.log (7 月 17 日更新)看起来问题出在命令sbatch ~/job.bash | tee output.log sbatch ~/job.bash | tee output.log , it doesnt get submitted using nohup and hence JOBID never gets stored and echo'd. sbatch ~/job.bash | tee output.log ,它不会使用nohup提交,因此JOBID永远不会被存储和回显。

(Update Jul 18) As per the comments from @pynexj adding set -x in the script results: (7 月 18 日更新)根据@pynexj 在脚本结果中添加set -x的评论:

nohup: ignoring input and redirecting stderr to stdout
+ date
Mon Jul 18 21:46:35 +03 2022
++ sbatch ~/job.bash
++ tee output.log
+ JOBID=
++ echo
++ awk '{print $4}'
+ JOBID=
+ echo

The issue still persists.问题仍然存在。 It appears that nohup is incompatible with sbatch .看来nohupsbatch不兼容。


Question: Why should nohup prevent submission of slurm job?问题:为什么 nohup 应该阻止提交 slurm 作业? Its objective is merely to capture terminate signal?它的目的仅仅是捕获终止信号?

If this problem only happens with nohup present, you can get the benefits of nohup without actually using it with:如果这个问题只发生在nohup存在的情况下,您可以在不实际使用nohup的情况下获得它的好处:

yourscript </dev/null >file.log 2>&1 & disown -h "$!"

This does the following:这将执行以下操作:

  • Redirects stdin from /dev/null with </dev/null使用</dev/null/dev/null重定向标准输入
  • Redirects stdout and stderr to a log file with >file.log 2>&1使用>file.log 2>&1将 stdout 和 stderr 重定向到日志文件
  • Tells the shell not to forward HUP signals to the background process with disown -h "$!"使用disown -h "$!"告诉 shell 不要将 HUP 信号转发到后台进程。

...which is everything nohup does. ...这就是nohup所做的一切。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM