
Use Python to automate creation of bash scripts

I'm trying to get some help with a problem which I've laid out here, and from some further research I think that a Python script might be the answer. That said, I'm new to Python and not sure how to implement what I have in mind, or whether it would be the right approach.

Essentially, I think I need a Python script that can take variables I pass to it and write those variables into .sh files. Is there a simple way to do this?

EDIT: In response to a couple of comments, I think I ought to spell out my problem a little more.

I'm running a MATLAB function via a SLURM script. The SLURM script is (I think) a kind of bash script, but specifically for scheduling jobs on an HPC. The problem is that I want to, for example, submit ten jobs at once, all with a particular variable changed to some value, and as far as I can tell there's no good way of passing variables to SLURM scripts. So what I currently do is keep ten versions of the submission script, each with its own fixed variable; when I want to submit all the jobs, I open each of the ten scripts, update the shared variable manually, and then run them one by one. I think what I'm after is a Python script which will go into each of these SLURM scripts and edit them.

There are two options that could solve your problem:

  1. You could use SLURM arrays

The example below cycles through the numbers 1-16:

#!/bin/bash

#SBATCH --job-name=array
#SBATCH --output=array_%A_%a.out
#SBATCH --error=array_%A_%a.err
#SBATCH --array=1-16
#SBATCH --time=01:00:00
#SBATCH -p partition-name
#SBATCH --ntasks=1
#SBATCH --mem=4G

# Print the task id.
echo "My SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID

# run code
./exec ${SLURM_ARRAY_TASK_ID}-inputfile.i
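
With an array you submit the script once (e.g. sbatch jobscript.sbatch) and SLURM launches all 16 tasks itself. Inside each task the id is an ordinary environment variable, so a MATLAB call could use it directly, e.g. matlab -nodisplay -r "myfunc($SLURM_ARRAY_TASK_ID); exit" (myfunc is a hypothetical function name here).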
  2. You could use Python Template strings

Here is an example, generate-scripts.py. Below it cycles through 0-9, but you could make it substitute strings or anything else; Template strings are quite flexible:

from string import Template

templatefname = "template.sh"  # path to your template file (assumed name)

# Create input files from the boilerplate template
with open(templatefname, "r") as template_f:
    template = Template(template_f.read())

for i in range(10):
    new_file_string = template.safe_substitute(var=str(i))
    with open(f"script-{i}.sh", "w") as out_f:
        out_f.write(new_file_string)

Then in your template file you could have:

# blah blah
./run ${var}
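
Once the scripts are generated, you can also submit them from the same Python script. A minimal sketch, assuming sbatch is on your PATH and the generated files are named script-0.sh through script-9.sh as above:

import subprocess

# Submit each generated script; check=True raises if sbatch returns an error
for i in range(10):
    subprocess.run(["sbatch", f"script-{i}.sh"], check=True)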

I've used both solutions. SLURM arrays are convenient if your changing inputs are just increasing integers. Python Template strings are more powerful, but a bit more work.

To answer your actual SLURM-related question, it looks like you should just use --export when creating the job to define variables, which are then available as environment variables for your job:

sbatch --export=A=5,b='test' jobscript.sbatch
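
Inside the job, those values appear as ordinary environment variables. If the job step were a Python script, reading them back looks like this (a sketch; the names match the sbatch line above):

import os

# Exported variables arrive as environment variables (always strings)
a = os.environ.get("A")   # "5"
b = os.environ.get("b")   # "test"
print(a, b)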

Quoting the manual for completeness:

--export=<environment variables | ALL | NONE>

Identify which environment variables from the submission environment are propagated to the launched application. By default, all are propagated. Multiple environment variable names should be comma separated. Environment variable names may be specified to propagate the current value (eg "--export=EDITOR") or specific values may be exported (eg "--export=EDITOR=/bin/emacs"). [...]
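
Note that when you name specific variables, you can also prepend ALL (e.g. --export=ALL,A=5) to keep propagating the rest of your submission environment alongside them; see the sbatch manual for the exact semantics.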

This creates files with the same content but different numbers:

    def writeFile(number, filename="job", head="", tail=""):
        # Write one .slurm file whose name and body embed the given number
        with open(filename + str(number) + ".slurm", "w") as f:
            f.write(head + str(number) + tail)
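
For example (hypothetical file names and contents), this writes job0.slurm through job9.slurm:

    for n in range(10):
        writeFile(n, filename="job", head="#!/bin/bash\n./run ", tail="\n")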

I would suggest simple-slurm, a Python wrapper for Slurm that I developed. Then you can use simple Python logic and variables to create the jobs.
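
A minimal sketch of what that can look like, based on the project's README (the executable and option values here are assumptions; check the simple_slurm documentation for the exact interface):

from simple_slurm import Slurm

# Each keyword argument becomes an #SBATCH option in the generated script
for value in range(10):
    slurm = Slurm(job_name=f"job_{value}", ntasks=1, time="01:00:00")
    slurm.sbatch(f"./run {value}")  # hypothetical executable taking the value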
