
Slurm - Job State says failed, but output still generated

I have a Slurm job scheduled and running on a cluster. It is a simple sbatch script that runs a MATLAB .m file. After it finishes running, the output (two graphs) is generated as expected. However, when I run sacct, the job state reads "FAILED" and the exit code reads "9:0". To me it should read COMPLETED instead.

In my sbatch file, I did specify error and output, and the two files are indeed generated, but with no content.

Can someone please help?

The final state of the job is dictated by the return code of the submission script, which is in turn the return code of the last command in that script. So the most plausible explanation is that even though the MATLAB script runs fine, the last command of the submission script does not. If MATLAB is the last command of the script, then it appears to return a non-zero code, probably because some cleanup tasks could not be performed.
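To illustrate how the last command decides the script's return code, here is a minimal sketch in plain bash. It does not call Slurm or MATLAB: `fake_matlab` and `submission_script` are hypothetical stand-ins, and the exit code 9 is simply borrowed from the "9:0" in the question.

```shell
#!/bin/bash

# Hypothetical stand-in for the MATLAB call: it produces its output,
# then exits with a non-zero code (9 is an assumption taken from the question).
fake_matlab() {
    echo "graphs written"
    return 9
}

# Mimics a submission script whose last command is the MATLAB call.
# The parentheses run the body in a subshell, like a separate script.
submission_script() (
    fake_matlab > /dev/null
)

submission_script
script_status=$?   # exit status of the last command inside the "script"
echo "script exit status: $script_status"
```

The work itself succeeds, yet the reported status is non-zero because it comes from the last command. In a real submission script, printing `$?` right after the MATLAB line would likely show which command is responsible.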
