
How do I get the Slurm job ID?

#!/bin/bash
#SBATCH -N 1      # nodes requested
#SBATCH -n 1      # tasks requested
#SBATCH -c 4      # cores requested
#SBATCH --mem=10  # memory in MB
#SBATCH -o outfile  # send stdout to outfile
#SBATCH -e errfile  # send stderr to errfile
#SBATCH -t 0:01:00  # time requested in hour:minute:second

module load anaconda
python hello.py jobid

Let's say I have this script and I want to pass the job ID to Python. How do I get the job ID? For example, when I run

sbatch script.sh
Submitted batch job 10514

how do I get the number 10514 and pass it to Python?

You can use squeue. Its usage summary is as follows.

Usage: squeue [-A account] [--clusters names] [-i seconds] [--job jobid]
              [-n name] [-o format] [-p partitions] [--qos qos]
              [--reservation reservation] [--sort fields] [--start]
              [--step step_id] [-t states] [-u user_name] [--usage]
              [-L licenses] [-w nodes] [--federation] [--local] [--sibling]
              [-ahjlrsv]

I will show you how to do it with squeue -u, which lists only the jobs of a given user. In my case the username is s.1915438.

Here I submit a job.

[s.1915438@cl2 ~]$ sbatch jupyter.sh 
Submitted batch job 38529784
[s.1915438@cl2 ~]$ squeue -u s.1915438
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          38529784  gpu_v100 jupyter- s.191543  R       2:09      1 ccs2101

Here the job ID is 38529784. You can also use the USER variable as follows.

[s.1915438@cl2 ~]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          38529784  gpu_v100 jupyter- s.191543  R       0:47      1 ccs2101

If you echo the USER variable, you will see that it holds your username. This is particularly useful when you write scripts.

[s.1915438@cl2 ~]$ echo $USER
s.1915438

You can do the same with squeue -n if you know the job name.

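To get the job ID in Python, capture the command's output with the subprocess module. os.system only returns the exit status, so the ID that the command prints cannot be stored in a variable that way.

>>> import subprocess
>>> cmd = "squeue -u $USER | tail -1 | awk '{print $1}'"
>>> a = subprocess.check_output(cmd, shell=True, text=True).strip()
>>> a
'38529793'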
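
The shell pipeline keeps only the last row of the table. If you have several jobs queued, it may be easier to parse the whole squeue table in Python. The following is a minimal sketch, not a definitive implementation: the function name job_ids_from_squeue is made up for illustration, and it assumes squeue's default column layout with plain numeric job IDs (array-style IDs such as 123_4 would be skipped). The sample text is the squeue output shown earlier.

```python
def job_ids_from_squeue(text):
    """Return the JOBID column of squeue's default table output as a list of ints.

    Hypothetical helper: assumes the job ID is the first whitespace-separated
    field on each row and that it is purely numeric.
    """
    ids = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, whose first field is "JOBID".
        if not fields or not fields[0].isdigit():
            continue
        ids.append(int(fields[0]))
    return ids


# Sample copied from the squeue output shown earlier in this answer.
sample = """\
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          38529784  gpu_v100 jupyter- s.191543  R       2:09      1 ccs2101
"""
print(job_ids_from_squeue(sample))  # [38529784]
```

To use it on live data, pass in the text captured from squeue -u $USER (for example via subprocess.check_output) instead of the sample string.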

In the squeue pipeline, tail -1 keeps the last row of the output and awk '{print $1}' selects its first column, which is the JOBID. As an extra, if you want to cancel a job then use scancel as follows.

[s.1915438@cl2 ~]$ scancel 38529784 

Sometimes scancel may take 5-10 seconds to take effect.
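
If hello.py runs inside the batch job itself, as in the script from the question, there may be a simpler route: Slurm exports the job ID to the job's environment as SLURM_JOB_ID, so the script line could read python hello.py $SLURM_JOB_ID, or Python can read the variable directly. The following is a sketch under that assumption. The function name current_job_id is hypothetical, and it returns None when it is run outside a Slurm job with no argument given.

```python
import os
import sys


def current_job_id(argv=None, environ=None):
    """Return the Slurm job ID as an int, or None when none is available.

    Hypothetical helper for hello.py; argv and environ can be passed in
    explicitly so the function is easy to try outside a real job.
    """
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    # Prefer an explicit command-line argument,
    # e.g. python hello.py $SLURM_JOB_ID in the batch script.
    if len(argv) > 1 and argv[1].isdigit():
        return int(argv[1])
    # Otherwise fall back to the variable Slurm sets inside the job.
    value = environ.get("SLURM_JOB_ID")
    return int(value) if value and value.isdigit() else None


if __name__ == "__main__":
    print("job id:", current_job_id())
```

With this approach no squeue call is needed, so it also works when you have several jobs in the queue at once.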
