
Wait for subshell process to complete

processUsageFile()
{
    #sdate=`pin_virtual_time  | awk -F" " '{print $3}'`;

    #Get all new files to be loaded to brm staging data.
    count=`ls ${PRE_STAGING}/TWN* 2>/dev/null|grep -v reprocess|wc -l`
    if [ $count -ne 0 ];then
        # Fork subshell
        (./efx_omc_brm_rpt_process.sh -t TWN & )&
        exitOnError
    fi

    #Process Rapid Report files
    count=`ls $PRE_STAGING/RR* 2>/dev/null|grep -v  reprocess|wc -l`
    if [ $count -ne 0 ];then
        (./efx_omc_brm_rpt_process.sh -t RR &)&
        exitOnError
    fi
...
...
}
#Reprocessing. Process the reprocessed files.
#This method updates the records in the BRM staging table.
reprocessingUsageFile()
{
    #Process TWN fulfillment reprocess files
    count=`ls $PRE_STAGING/TWN*reprocess* 2>/dev/null|wc -l`
    if [ $count -ne 0 ];then
        # Fork subshell
        (./efx_omc_brm_rpt_reprocess.sh -t TWN & ) &
    fi

    #Process Rapid Report files
    count=`ls $PRE_STAGING/RR*reprocess* 2>/dev/null|wc -l`
    if [ $count -ne 0 ];then
        (./efx_omc_brm_rpt_reprocess.sh -t RR &) &
    fi
...
...
}

#Pre processing
PreProcessing

# Start processing usage files.
processUsageFile

processErrFile 

The idea of the above code is to do parallel processing: both methods fork multiple subshells and detach them from the tty. I would like to know if there is a way to wait for the first two methods to complete before running the last one.

Waiting on the PIDs somehow isn't reliable. Still trying...

waitPids() {
    echo "Testing ${pids[*]} -- ${#pids[@]}"
    while [ ${#pids[@]} -ne 0 ]; do
        local i
        for i in "${!pids[@]}"; do          # iterate over the live indices
            if ! kill -0 "${pids[$i]}" 2>/dev/null; then
                echo "Done -- ${pids[$i]}"
                unset "pids[$i]"
            fi
        done
        pids=("${pids[@]}")                 # re-pack the array
        sleep 1
    done
}
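For waitPids to work at all, the array has to hold PIDs of processes that this shell actually started. A minimal, self-contained sketch of the idea (sleep stands in for the real efx_omc_brm_rpt_process.sh invocations, which are assumed not to be available here): fork each job once, with no extra detaching layer, record $! so the PIDs belong to children of this shell, then poll them:

```shell
#!/bin/bash
# Hypothetical sketch: fork each job ONCE (no "( ... & )&" double fork),
# record $! so the PIDs are children of this shell, then poll with kill -0.
pids=()
sleep 1 & pids+=($!)     # stand-in for ./efx_omc_brm_rpt_process.sh -t TWN
sleep 2 & pids+=($!)     # stand-in for ./efx_omc_brm_rpt_process.sh -t RR

while [ ${#pids[@]} -ne 0 ]; do
    alive=()
    for pid in "${pids[@]}"; do
        # kill -0 sends no signal; it only checks the process still exists
        kill -0 "$pid" 2>/dev/null && alive+=("$pid")
    done
    pids=("${alive[@]}")
    sleep 1
done
echo "all jobs finished"
```

With the double fork from the question, $! would name the intermediate subshell rather than the worker, which is one reason PID-based waiting seemed inaccurate.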

It seems that the main problem is that you are using detached subshells.

Maybe the easiest solution would be to use a different mechanism to detach the subshells, so that you can still use wait.

e.g. via nohup:

nohup ./process1 &
nohup ./process2 &
wait
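A self-contained demo of this idea (sleep stands in for ./process1 and ./process2, which are assumed names from above): nohup only makes the jobs immune to hangups; they remain children of the shell, so a plain wait still blocks until both exit.

```shell
#!/bin/bash
# sleep stands in for ./process1 and ./process2 (hypothetical names)
start=$SECONDS
nohup sleep 2 >/dev/null 2>&1 &
nohup sleep 2 >/dev/null 2>&1 &
wait                                  # returns only after BOTH children exit
elapsed=$((SECONDS - start))
echo "both done after ~${elapsed}s"   # ~2s: the jobs ran in parallel
```

If the two sleeps ran sequentially this would take about four seconds; the measured time near two seconds shows both parallelism and the blocking wait.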

Use Wait Builtin

$ help wait
wait: wait [-n] [id ...]
    Wait for job completion and return exit status.

    Waits for each process identified by an ID, which may be a process ID or a
    job specification, and reports its termination status.  If ID is not
    given, waits for all currently active child processes, and the return
    status is zero.  If ID is a job specification, waits for all processes
    in that job's pipeline.

    If the -n option is supplied, waits for the next job to terminate and
    returns its exit status.

    Exit Status:
    Returns the status of the last ID; fails if ID is invalid or an invalid
    option is given.

Minimalist Example

$ (sleep 3; false) & wait -n; echo $?
1

Your Code as Example

Background tasks return immediately. The trick for you will be to wrap each function in a subshell, run it in the background, and then wait for that subshell (rather than for its detached background jobs) to complete. For example:

$ (processUsageFile) & wait -n; echo $?

If you want to get more complicated than that, you will have to capture the PIDs of the background tasks you spawn in variables, so that you can wait for specific processes with a construct like wait $pidof_process_1 $pidof_process_2.

Wrapping the function in a subshell is just easier. However, your specific needs may vary.
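Putting that together against the functions from the question (stub bodies here, since the real scripts aren't available): run each function in a backgrounded subshell, capture the two PIDs, and wait for exactly those before the final step.

```shell
#!/bin/bash
# Stub bodies standing in for the real functions from the question.
processUsageFile()      { sleep 1; }
reprocessingUsageFile() { sleep 1; }
processErrFile()        { echo "processing error files"; }

(processUsageFile)      & pid1=$!
(reprocessingUsageFile) & pid2=$!
wait "$pid1" "$pid2"      # block until both subshells finish
status=$?
processErrFile            # runs only after the first two complete
```

Note that this only works because each function body runs entirely inside its subshell; any jobs the subshell detaches internally are still invisible to wait.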

Possibly a 'wait' command between the process and reprocess steps would be enough.

from: http://www.tldp.org/LDP/abs/html/subshells.html

Example 21-3. Running parallel processes in subshells

(cat list1 list2 list3 | sort | uniq > list123) &
(cat list4 list5 list6 | sort | uniq > list456) &
# Merges and sorts both sets of lists simultaneously.
# Running in background ensures parallel execution.
#
# Same effect as
#   cat list1 list2 list3 | sort | uniq > list123 &
#   cat list4 list5 list6 | sort | uniq > list456 &

wait   # Don't execute the next command until subshells finish.

diff list123 list456

The best way I found to parallelize and wait is to export a function for use in a subshell and run it via xargs, using -P for the maximum number of parallel processes while feeding a specific number of arguments to the work function with -n or -L.

from: https://man7.org/linux/man-pages/man1/xargs.1.html

       -P max-procs, --max-procs=max-procs
              Run up to max-procs processes at a time; the default is 1.
              If max-procs is 0, xargs will run as many processes as
              possible at a time.  Use the -n option or the -L option
              with -P;

Sample code:

# define some work function and export it
function unit_action() {
  echo action $*
  sleep 5
  echo action $* done
}
export -f unit_action

# list all arguments to feed into function
# with 2 parameters at a time in a maximum of 3 parallel threads
echo {1..9} | xargs -t -n 2 -P 3 bash -c 'unit_action $@' --
echo all done

xargs will implicitly wait until all input is consumed so no need for explicit wait command.
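A quick way to see both the parallelism and the implicit wait at once (assuming a GNU or BSD xargs with -P support): four one-second jobs at -P 4 finish in roughly one second, and the line after xargs only runs once all of them are done.

```shell
#!/bin/bash
start=$SECONDS
# Four 1-second jobs, up to 4 at a time; -I_ runs one job per input line.
seq 1 4 | xargs -P 4 -I_ sleep 1
elapsed=$((SECONDS - start))
echo "all done in ~${elapsed}s"   # ~1s rather than 4s: jobs ran in parallel
```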
