
How do I rerun a bash script, skipping over lines which have previously run successfully?

I have a bash script which acts as a wrapper for an analysis pipeline. If the script errors out, I want to be able to resume from the point at which the error occurred by simply re-running the original command. I have set two traps: one removes the file that was being generated when the script exits with a non-zero status; the other removes all the temporary files when the script exits with status 0, cleaning up the file system at the end of the run. I turned on noclobber in the bash environment, which lets my script skip over lines whose output files have already been written, but this only works if I do not set the non-zero exit trap. As soon as I set this trap, the script exits at the first line where noclobber finds a file it will not overwrite. Is there a way to skip over lines of code that have already run successfully, rather than having to re-run my code from the start? I know I could use conditional statements for each line, but I thought there might be a neater way of doing this.

set -o noclobber

# Function to clean up temporary folders when script exits at the end
rmfile() { rm -r "$1"; }

# Function to remove the file being currently generated
# Function executed if script errors out

rmlast() {
    # Only act if a file is currently being generated
    if [ -n "$CURRENTFILE" ]
    then
        rm -r "$1"
        exit 1
    fi
}

# Trap to remove the currently generated file
trap 'rmlast "$CURRENTFILE"' ERR SIGINT

#Make temporary directory if it has not been created in a previous run
TEMPDIR=$(find . -name "tmp*")
if [ -z "$TEMPDIR" ]
then
TEMPDIR=$(mktemp -d /test/tmpXXX)
fi

# Set CURRENTFILE variable
CURRENTFILE="${TEMPDIR}/Variants.vcf"

# Generate the first output file
complexanalysis_tool input_file > "$CURRENTFILE"

# Set CURRENTFILE variable
CURRENTFILE="${TEMPDIR}/Filtered.vcf"

complexanalysis_tool2 input_file2 > "$CURRENTFILE"

CURRENTFILE="${TEMPDIR}/Filtered_2.vcf"

complexanalysis_tool3 input_file3 > "$CURRENTFILE"

# Move files to final destination folder
mv -nv $TEMPDIR/*.vcf /test/newdest/

# Trap to remove temporary folders when script finishes running
trap 'rmfile "$TEMPDIR"' 0
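The interaction described above can be reproduced in isolation. The following is a minimal sketch, not part of the original script; demo_dir and out.txt are hypothetical names. It shows that under noclobber a redirect onto an existing file is refused and the command returns a non-zero status. Any non-zero status is what fires an ERR trap, which is why the script stops at the first file that already exists instead of skipping it.

```shell
#!/bin/bash
# Minimal sketch: with noclobber set, redirecting onto an existing file fails.
# demo_dir and out.txt are hypothetical names used only for this demonstration.
set -o noclobber
demo_dir=$(mktemp -d)

echo "first run" > "$demo_dir/out.txt"    # file is new, so the write succeeds
first_status=$?

# Second write targets the same file; noclobber refuses to overwrite it.
if { echo "second run" > "$demo_dir/out.txt"; } 2>/dev/null; then
    second_status=0
else
    second_status=$?
fi

kept=$(cat "$demo_dir/out.txt")           # the original content is untouched
echo "first write status: $first_status"
echo "second write status: $second_status (non-zero means the redirect was refused)"

set +o noclobber
rm -r "$demo_dir"
```

Without an ERR trap the non-zero status is simply ignored and the script carries on, which is the "skipping" behaviour seen when the trap is not set.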

Update:

I have been offered answers suggesting the use of the make utility. I want to make use of its built-in ability to check whether a dependency has been fulfilled. In my hands, the makefile suggested by VK Kashyap does not seem to skip execution of previously accomplished tasks. For example, I ran the script above and interrupted it with Ctrl-C while it was generating filtered.vcf. When I rerun the script, it starts from the beginning again, i.e. at variants.vcf. Am I missing something needed to get the makefile to treat the sources as fulfilled?

Answer to update:

OK, this is a rookie mistake, but since I am not familiar with writing makefiles I will post this explanation of my error. The reason my makefile was not resuming from the exit point was that I had given the targets names different from the output files being generated. So, as VK Kashyap quite correctly answered, if you name the targets, e.g.

variants.vcf
filtered.vcf
filtered2.vcf

the same as the output files being generated then the script will skip previously accomplished tasks.
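As a rough illustration of this point, the following sketch builds a throwaway makefile whose target name matches the file its recipe writes, then runs make twice. The directory, the run.log file and the echo recipe are hypothetical stand-ins for the real pipeline, and the demo only runs if make is installed.

```shell
#!/bin/bash
# Sketch: a target named after its output file is skipped once that file exists.
# demo_dir, run.log and the echo recipe are hypothetical stand-ins.
demo_dir=$(mktemp -d)

# printf writes real tab characters (\t), which make requires before recipe lines.
printf 'variants.vcf:\n\techo ran >> run.log\n\techo data > variants.vcf\n' \
    > "$demo_dir/Makefile"

if command -v make >/dev/null 2>&1; then
    make -s -C "$demo_dir" variants.vcf   # first run: file missing, recipe executes
    make -s -C "$demo_dir" variants.vcf   # second run: file exists, recipe is skipped
    runs=$(($(wc -l < "$demo_dir/run.log")))
    echo "recipe ran $runs time(s)"
fi

rm -r "$demo_dir"
```

If the target were named something other than variants.vcf, make would never find a file with the target's name and would rerun the recipe on every invocation.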

The make utility might be the answer for what you want to achieve.

It has built-in dependency checking (the thing you are trying to achieve with tmp files).

# Run the "all" target once all of the files are available.
# Note: each recipe line must start with a tab character, not spaces.
.PHONY: all
all: variants.vcf filtered.vcf filtered2.vcf
	mv -nv $(TEMPDIR)/*.vcf /test/newdest/

variants.vcf:
	complexanalysis_tool input_file > variants.vcf

filtered.vcf:
	complexanalysis_tool2 input_file2 > filtered.vcf

filtered2.vcf:
	complexanalysis_tool3 input_file3 > filtered2.vcf

You can invoke this makefile from a bash script:

#!/bin/bash

export TEMPDIR=xyz
make -C "$TEMPDIR" all

make checks for itself which tasks have already been accomplished and skips them. It continues from where the error occurred and finishes the remaining tasks.

You can find more details about the exact makefile syntax on the internet.

There is no built-in way to do that.

However, you could brew something like that by keeping track of the last successful line and building your own goto statement, as described here and in Is there a "goto" statement in bash? (just replace the 'labels' with actual line numbers).

However, the question is whether this is really a smart idea.

A better way is to run only the commands whose output is still missing, rather than trying to resume at a particular line. This could be done either by explicit conditionals in your bash script:

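produce_if_missing() {
   # check whether the file named by the first argument exists;
   # if not, run the rest of the arguments as a command and
   # redirect its output into that file
   local curfile=$1
   shift
   if [ ! -e "${curfile}" ]; then
     # quote "$@" so arguments containing spaces stay intact
     "$@" > "${curfile}"
   fi
}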

produce_if_missing Variants.vcf complexanalysis_tool input_file
produce_if_missing Filtered.vcf complexanalysis_tool2 input_file2

or by using tools that are made for such things (see VK Kashyap's answer using make, though I prefer using variables in the make rules to minimize typos):

Variants.vcf: input_file
	complexanalysis_tool $^ > $@
Filtered.vcf: input_file2
	complexanalysis_tool2 $^ > $@
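One gap in the produce_if_missing approach is that a command which fails or is interrupted partway leaves a partial file behind, and the next run would treat it as finished. A possible variant, sketched here under the assumption that the output can be written to a sibling file first, only renames the file into place after the command succeeds. The function name produce_atomically and the .partial suffix are hypothetical, not from the answers above.

```shell
#!/bin/bash
# Hypothetical variant of produce_if_missing: write to "<target>.partial" first
# and rename it to the final name only if the command exits successfully.
produce_atomically() {
    local target=$1
    local rc
    shift
    if [ -e "$target" ]; then
        return 0                          # produced by an earlier run, skip
    fi
    rm -f "$target.partial"               # discard leftovers from an interrupted run
    if "$@" > "$target.partial"; then
        mv "$target.partial" "$target"
    else
        rc=$?                             # exit status of the failed command
        rm -f "$target.partial"
        return "$rc"
    fi
}

# Small demonstration with stand-in commands instead of the real analysis tools.
work_dir=$(mktemp -d)
produce_atomically "$work_dir/ok.txt" echo "hello"
produce_atomically "$work_dir/ok.txt" echo "changed"    # skipped: file exists
produce_atomically "$work_dir/bad.txt" false || echo "step failed, nothing left behind"
```

With this shape no trap is needed to delete the file currently being generated, because a file only ever appears under its final name once it is complete.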


 