bash script to run a constant number of jobs in the background

I need a bash script to run some jobs in the background, three jobs at a time.

I know I can do this in the following way; for illustration, I will assume the number of jobs is 6:

./j1 &
./j2 &
./j3 &
wait
./j4 &
./j5 &
./j6 &
wait

However, this way, if, for example, j2 takes a lot longer to run than j1 and j3, then I will be stuck with only one background job running for a long time.

The alternative (which is what I want) is that whenever one job completes, bash should start the next job in the queue, so that 3 jobs are running at any given time. Is it possible to write a bash script to implement this alternative, possibly using a loop? Please note that I need to run far more jobs than this, and I expect this alternative method to save me a lot of time.

Here is my draft of the script, which I hope you can help me verify and improve, as I'm new to bash scripting. The ideas in this script were taken and modified from here, here, and here:

for i in $(seq 6)
do
   # wait here if the number of jobs is 3 (or more)
   while (( $(jobs -p | wc -l) >= 3 ))
   do 
      sleep 5      # check again after 5 seconds
   done

   ./j"$i" &
done
wait

I think this script achieves the required behavior. However, I would like to know, from bash experts, whether I'm doing something wrong or whether there is a better way of implementing this idea.

Thank you very much.

With GNU xargs:

printf '%s\0' j{1..6} | xargs -0 -n1 -P3 sh -c './"$1"' _
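
The same pattern extends to any NUL-delimited list of commands; as a sketch (the ./job-* script names here are hypothetical), it can run every match of a glob, three at a time:

# Run every executable matching ./job-*, at most 3 in parallel.
printf '%s\0' ./job-* | xargs -0 -n1 -P3 sh -c '"$1"' _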

With bash (4.3+, for wait -n) builtins:

max_jobs=3; cur_jobs=0
for ((i=1; i<=6; i++)); do
  # If we're already at the limit, wait for one background job to finish.
  ((cur_jobs >= max_jobs)) && wait -n
  # Launch the next job and increment the count of jobs started.
  ./j"$i" & ((++cur_jobs))
done
wait
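
To try this pattern without the j1..j6 scripts, short sleep commands can stand in for the jobs (the durations here are arbitrary):

max_jobs=3; cur_jobs=0
for ((i=1; i<=6; i++)); do
  ((cur_jobs >= max_jobs)) && wait -n
  # A stand-in job: sleep 1-3 seconds, then report.
  { sleep $((RANDOM % 3 + 1)); echo "job $i finished"; } & ((++cur_jobs))
done
wait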

Note that the approach relying on builtins has some corner cases -- if you have multiple jobs exiting at the exact same time, a single wait -n can reap several of them, thus effectively consuming multiple slots. If we wanted to be more robust, we might end up with something like the following:

max_jobs=3
declare -A cur_jobs=( ) # build an associative array w/ PIDs of jobs we started
for ((i=1; i<=6; i++)); do
  if (( ${#cur_jobs[@]} >= max_jobs )); then
    wait -n # wait for at least one job to exit
    # ...and then remove any jobs that aren't running from the table
    for pid in "${!cur_jobs[@]}"; do
      kill -0 "$pid" 2>/dev/null && unset cur_jobs[$pid]
    done
  fi
  ./j"$i" & cur_jobs[$!]=1
done
wait

...which is obviously a lot of work, and still has a minor race. Consider using xargs -P instead. :)

Using GNU Parallel:

parallel -j3 ::: ./j{1..6}

Or if your shell does not do {1..6} expansion (e.g. csh):

seq 6 | parallel -j3 ./j'{}'

If you think you cannot install GNU Parallel, please read http://oletange.blogspot.dk/2013/04/why-not-install-gnu-parallel.html and leave a comment on why you cannot install it.
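
If you also want per-job timing and exit status, GNU Parallel's --joblog option records them (assuming /tmp/joblog is a writable path):

parallel -j3 --joblog /tmp/joblog ::: ./j{1..6}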

Maybe this could help.

Sample use case: run 'sleep 20' 30 times, just as an example; it could be any job or another script. The control logic, inside a while loop, keeps checking whether the number of jobs already fired is less than the configured maximum of processes: if it is, fire another one; if not, sleep 0.5 seconds.

Script output: In the snippet below, we can see that 30 'sleep 20' commands are now running in the background, since we configured max=30.

%_Host@User> ps -ef|grep 'sleep 20'|grep -v grep|wc -l
30
%_Host@User>

Changing the number of jobs at runtime: The script has a parameter "max", which takes its value from the file "max.txt" (max=$(cat max.txt)) and re-reads it on each iteration of the while loop. As shown below, once we change it to 45, we have 45 'sleep 20' commands running in the background. You can put the main script in the background and simply keep changing the max value inside "max.txt" to control the level of parallelism.

%_Host@User> cat > max.txt
45
^C
%_Host@User> ps -ef|grep 'sleep 20'|grep -v grep|wc -l
45
%_Host@User>

Script:

#!/bin/bash
#---------------------------------------------------------------------#
proc='sleep 20'     # your process, script, or anything else
max=$(cat max.txt)  # configure how many jobs you want
curr=0
#---------------------------------------------------------------------#
while true
do
  curr=$(ps -ef | grep "$proc" | grep -v grep | wc -l)
  max=$(cat max.txt)
  while [[ $curr -lt $max ]]
  do
    ${proc} &            # send one process to the background
    max=$(cat max.txt)   # after firing one job, recalculate max and curr
    curr=$(ps -ef | grep "$proc" | grep -v grep | wc -l)
  done
  sleep .5    # sleep 0.5 seconds once the max number of jobs is reached
done
#---------------------------------------------------------------------#
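
One caveat of counting with ps/grep is that it matches any process on the machine whose command line contains the pattern. If all jobs are launched from the script itself, counting with the jobs builtin avoids this; a minimal sketch of that variant, keeping the max.txt mechanism above:

#!/bin/bash
proc='sleep 20'
while true
do
  max=$(cat max.txt)         # re-read the limit on every pass
  curr=$(jobs -rp | wc -l)   # running background jobs of this shell only
  if [[ $curr -lt $max ]]
  then
    ${proc} &
  else
    sleep .5
  fi
done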

Let us know if this was useful.

This is how I do it:

  1. Enable job control in our script:

     set -m
  2. Create a trap which kills all jobs if the script is interrupted:

     trap 'jobs -p | xargs kill 2>/dev/null;' EXIT
  3. Use a loop to start a maximum of 3 jobs in the background:

     for i in $(seq 6); do
       while [[ $(jobs | wc -l) -ge 3 ]]; do
         sleep 5
       done
       ./j"$i" &
     done
  4. Finally, bring our background jobs back to the foreground:

     while fg >/dev/null 2>&1; do
       echo -n "" # output nothing
     done

Because of the last part, the script does not exit as long as jobs are running, and it keeps the jobs from being killed by the trap. Assembled, the four steps give the sketch below.
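
A minimal assembly of the four steps (using the j1..j6 jobs from the question):

#!/bin/bash
set -m                                          # 1. enable job control
trap 'jobs -p | xargs kill 2>/dev/null;' EXIT   # 2. kill leftover jobs on exit

# 3. start at most 3 jobs at a time
for i in $(seq 6); do
  while [[ $(jobs | wc -l) -ge 3 ]]; do
    sleep 5
  done
  ./j"$i" &
done

# 4. pull the jobs back to the foreground until all have finished
while fg >/dev/null 2>&1; do
  echo -n ""
done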
