
Python multiprocessing.Pool with processes that crash

Well, they're not supposed to crash, but they do anyway. Is there a way to get multiprocessing.Pool, or any other multiprocessing tool, to restart a process that dies? How would I do this otherwise?

Thanks!

Edit: Some background. The process does several things with geometry in Autodesk Maya, which it handles fine. The problem is that every once in a while I'll have a file that decides, once it's finished and a new scene is being opened, to completely exit Maya (or mayapy) with no Python warnings or errors, and no critical process errors from Windows. It just dies. There's not really anything I can do about the crashing, unfortunately.

What I'm hoping for is a way to re-start any processes that have died from a crash.

Indeed, the error handling is better in Python 3.3, as masida said. Here I check for a timeout when a child process has died silently.

This workaround is for Python < 3.3 and multiprocessing.Pool; of course, managing your own processes is a good alternative.

Use pool.map_async to run the jobs asynchronously; you can then check whether they are done and how long they are taking. If they take too long (for instance, when one process died and will never return), kill all pool processes with pool.terminate() and start over. In code:

import time
import multiprocessing

# assumes func, args, maxWait, processes and pool (a multiprocessing.Pool) are defined elsewhere
done = False                                   # not finished yet
while not done:
    job_start = time.time()                    # start time
    jobs = pool.map_async(func, args)          # asynchronous pool call
    redo = False                               # no redo yet
    while not jobs.ready():                    # while jobs are not finished
        if (time.time() - job_start) > maxWait:    # check maximum time (user def.)
            pool.terminate()                   # kill old pool
            pool = multiprocessing.Pool(processes)  # create new pool
            redo = True                        # redo computation
            break                              # break loop (not finished)
        time.sleep(0.5)                        # avoid busy-waiting
    if not redo:                               # computation was successful
        result = jobs.get()                    # get results
        done = True                            # exit outer while
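
The snippet above assumes func, args, maxWait, processes and pool already exist. A minimal, hypothetical setup could look like this (the worker function and file names are placeholders for the real Maya job):

import multiprocessing

def func(scene_path):                       # placeholder for the real per-file job
    return scene_path

if __name__ == "__main__":
    args = ["a.ma", "b.ma", "c.ma"]         # hypothetical scene files to process
    maxWait = 600                           # user-defined timeout in seconds
    processes = 4                           # number of worker processes
    pool = multiprocessing.Pool(processes)  # initial pool
    # ... watchdog loop from above goes here ...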

Another option is to use a timeout on the iterator returned by pool.imap, which can be passed as a parameter to the iterator's next() method, next(timeout). If a process exceeds the timeout, multiprocessing.TimeoutError is raised in the main process, and actions similar to those described above can follow in the except block, although I have not tested this thoroughly.
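
I have not used this variant in production either, but a minimal sketch might look like the following, with a hypothetical work() function and a 30-second per-result timeout:

import multiprocessing

def work(item):
    # stand-in worker; the real job would be the Maya geometry processing
    return item * item

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=4)
    it = pool.imap(work, range(20))      # lazy result iterator
    results = []
    try:
        while True:
            # next() accepts a per-result timeout; it raises
            # multiprocessing.TimeoutError if a worker is unresponsive
            results.append(it.next(timeout=30))
    except StopIteration:
        pool.close()                     # all results collected normally
    except multiprocessing.TimeoutError:
        pool.terminate()                 # a worker hung or died: kill the pool
        # ... recreate the pool and resubmit the remaining work here ...
    pool.join()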

Apparently, this behaviour was recently changed in Python 3.3 to raise an exception in this case: http://hg.python.org/cpython/rev/6d6099f7fe89

The defect that led to this change is: http://bugs.python.org/issue9205

However, if you manually spawn the workers (which I usually do when I use multiprocessing), you may try to use the Process.is_alive() function: http://docs.python.org/dev/library/multiprocessing#multiprocessing.Process.is_alive
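
As a rough sketch of that approach, the parent could spawn workers that pull items from a queue, poll is_alive() and exitcode, and replace any worker that dies; the worker function, file names and polling interval below are placeholders:

import multiprocessing
import time

def worker(task_queue):
    # stand-in worker loop; the real job would open and process the Maya files
    for task in iter(task_queue.get, None):            # None is the shutdown sentinel
        pass                                           # ... do the actual work here ...

if __name__ == "__main__":
    num_workers = 4
    tasks = multiprocessing.Queue()
    for path in ["a.ma", "b.ma", "c.ma"]:              # hypothetical work items
        tasks.put(path)
    for _ in range(num_workers):                       # one shutdown sentinel per worker
        tasks.put(None)

    procs = [multiprocessing.Process(target=worker, args=(tasks,))
             for _ in range(num_workers)]
    for p in procs:
        p.start()

    while True:
        if all(not p.is_alive() and p.exitcode == 0 for p in procs):
            break                                      # every worker exited cleanly
        for i, p in enumerate(procs):
            if not p.is_alive() and p.exitcode != 0:   # this worker crashed
                # note: the item it was processing is lost and would need re-queueing
                procs[i] = multiprocessing.Process(target=worker, args=(tasks,))
                procs[i].start()                       # replace the dead worker
        time.sleep(1)                                  # poll periodically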
