python launch/wait on “paired” processes in parallel (probably popen/wait/subprocess?)

I'm reasonably sure some functionality already exists to do this, but I haven't been able to find it. What I'm basically trying to do is what we'd write in BASH as:

( sleep 1; echo "one" ) &
( sleep 2; echo "two" ) &
( sleep 3; echo "three" ) &
( sleep 4; echo "four" ) &

and have the entire thing execute in four seconds (not ten seconds)...

More generally, consider the case where I have a long list of processes that I need to run: (A1, B1, C1, D1 ...). I also have a list of "paired" processes, let's call them (A2, B2, C2, D2 ...). I want all the x1 processes to start in parallel, and as soon as any x1 process finishes, I want its corresponding x2 process to be launched.

I've figured out how to use subprocess.Popen, push each instance into a list, and then wait on all of those to finish, but I've only been able to wait on the entire initial set, and THEN fire off the second set. This is better, but not ideal by any stretch. This seems like something that shouldn't be too terribly hard, but I haven't been able to find it.

Another way to think about this is that if I have ten paired processes, immediately after invocation I'll have ten running processes, and ten OTHER processes each waiting on one of the first ten to finish.

(This actually needs to solve a larger, more general problem but once I can solve this case I can generalize and scale it...)

You can solve this in Python in many different ways. Here are three obvious ones:

  • The same way the shell solves it. This is probably the easiest, at least on a Unix-like system, but it enforces some separation you may not want and does not work on Windows.
  • Through polling. If you have nothing to do while polling, this may be a waste of system resources.
  • Through threading. This is the lightest weight, but also the trickiest to get right.

The way the shell handles this is that:

( sleep 1; echo "one" ) &

forks off a sub-shell. The sub-shell forks off a sub-sub-shell, and the sub-sub-shell execs sleep 1. The first sub-shell now waits for the sub-sub-shell, and when that finishes, it execs (no fork required this time) echo "one". (Meanwhile the main shell does not wait at all.)

Note that the number of processes here was 3: the main shell, the sub-shell, and the sub-sub-shell. The sub-sub-shell became the sleep, and then the first sub-shell became the echo. The main shell can wait for, and hence get the result-status of, the first sub-shell, but it cannot see the sub-sub-shell at all.

To do this directly in Python, either call a shell to run your two commands in sequence (this shell will fork once for the first command, then run the second directly) or use os.fork(). If you're the child of the fork, use subprocess (with .call or .Popen or whatever) to run the first command, then use one of the os.exec* functions (such as os.execvp) to run the second command. Alternatively, you can add yet more processes, and/or use multiprocessing (which adds a fancy communications mechanism between your various Python processes, so that you can do a lot more useful things, but it's even heavier-weight).
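As a rough illustration of the "call a shell" variant, here is a minimal sketch. It assumes a Unix-like system with sleep and echo available; the PAIRS list and the launch_pairs_via_shell helper are made-up names for this example, not part of any library.

```python
import subprocess

# Hypothetical command pairs (first, second); the second command of each
# pair must only start once the first command of that pair has exited.
PAIRS = [
    ("sleep 1", "echo one"),
    ("sleep 1", "echo two"),
    ("sleep 1", "echo three"),
]

def launch_pairs_via_shell(pairs):
    """Start one shell per pair; each shell runs its two commands in sequence."""
    procs = [
        subprocess.Popen(
            f"{first}; {second}",      # the shell handles the sequencing
            shell=True,
            stdout=subprocess.PIPE,
            text=True,
        )
        for first, second in pairs
    ]
    # All shells are already running in parallel at this point;
    # communicate() just waits for each one and collects its output.
    return [proc.communicate()[0].strip() for proc in procs]

outputs = launch_pairs_via_shell(PAIRS)
print(outputs)
```

Because every shell is started before any of them is waited on, the whole batch should take roughly as long as the slowest pair rather than the sum of all pairs. The trade-off is the one noted above: Python only sees each shell's exit status, not that of the individual commands.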

To use polling, note that a subprocess.Popen instance has a poll method. Call this to tell whether the process is still running (it returns None) or has finished (it returns the exit code).
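A polling loop along these lines might look like the sketch below. The function name, the interval parameter, and the example commands are assumptions made for illustration; the commands use sys.executable so the sketch does not depend on any particular shell.

```python
import subprocess
import sys
import time

def run_pairs_by_polling(pairs, interval=0.05):
    """Start every first command, then launch each second command as soon
    as its partner has exited. Returns the second commands' exit codes."""
    procs = {i: subprocess.Popen(first) for i, (first, _) in enumerate(pairs)}
    pending = {i: second for i, (_, second) in enumerate(pairs)}
    results = {}
    while procs:
        for i, proc in list(procs.items()):
            code = proc.poll()          # None means "still running"
            if code is None:
                continue
            if i in pending:
                # First command finished: start its paired second command.
                procs[i] = subprocess.Popen(pending.pop(i))
            else:
                # Second command finished: this pair is done.
                results[i] = code
                del procs[i]
        time.sleep(interval)            # avoid a busy loop between polls
    return [results[i] for i in range(len(pairs))]

# Hypothetical example pairs, given as argument lists.
pairs = [
    ([sys.executable, "-c", "import time; time.sleep(0.2)"],
     [sys.executable, "-c", "print('one')"]),
    ([sys.executable, "-c", "import time; time.sleep(0.1)"],
     [sys.executable, "-c", "print('two')"]),
]
codes = run_pairs_by_polling(pairs)
```

Like the shell's ";", this sketch starts the second command whether or not the first one succeeded; checking code before launching would give "&&" behaviour instead.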

To use threading, spin up threads that invoke subprocess.Popen and/or that invoke the .wait method on the created subprocess and then spin up the next one in the chain. You'll need to add your own locking around any variables shared across the various threads (such as the various work-lists—it may make sense to divide them up before spinning up the threads, so that each thread has a private work-list and merely contributes the final results, if any, under a lock).
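The threading approach could be sketched as follows, with one thread per pair. The names run_pairs_in_threads and worker are hypothetical, and the lock follows the advice above about guarding shared results (each thread writes its own slot here, so the lock is mostly there to show the pattern).

```python
import subprocess
import sys
import threading

def run_pairs_in_threads(pairs):
    """Run each (first, second) pair in its own thread.
    Returns the exit code of each second command, in input order."""
    results = [None] * len(pairs)
    lock = threading.Lock()

    def worker(index, first, second):
        subprocess.Popen(first).wait()           # block until the first exits
        code = subprocess.Popen(second).wait()   # then run its partner
        with lock:                               # guard the shared result list
            results[index] = code

    threads = [
        threading.Thread(target=worker, args=(i, first, second))
        for i, (first, second) in enumerate(pairs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Hypothetical example pairs, given as argument lists.
pairs = [
    ([sys.executable, "-c", "import time; time.sleep(0.2)"],
     [sys.executable, "-c", "print('one')"]),
    ([sys.executable, "-c", "import time; time.sleep(0.1)"],
     [sys.executable, "-c", "print('two')"]),
]
codes = run_pairs_in_threads(pairs)
```

Each thread spends nearly all its time blocked in wait, so the threads cost very little; for a very long list of pairs, a bounded pool such as concurrent.futures.ThreadPoolExecutor may be a better fit than one thread per pair.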
