
How do I terminate an asynchronous starmap multiprocessing pool in Python once any process has encountered an error

I'm using the starmap_async function from Python's multiprocessing library. However, I noticed that if my code encounters an error in one of the processes, the exception is not thrown until all processes have finished. Here's the relevant code:

from multiprocessing import Pool, cpu_count
import datetime
import itertools
import time
import pandas as pd

        with Pool(max(cpu_count()//2, 1)) as p:
            #p = Pool()
            df_iter = df_options.iterrows()
            ir = itertools.repeat
            results = p.starmap_async(_run, zip(df_iter, ir(fixed_options), ir(outputs_grab)), chunksize=1)
            p.close()  # no more jobs to submit

            # Printing progress
            n_remaining = results._number_left + 1
            while not results.ready():
                time.sleep(1)
                # Check for errors here ... How ????
                # Then what? call terminate()?????
                if verbose:
                    if results._number_left < n_remaining:
                        now = datetime.datetime.now()
                        n_remaining = results._number_left
                        print('%d/%d  %s' % (n_remaining, n_rows, str(now)[11:]))

            print('joining')
            p.join()

            all_results = results.get()

        df = pd.DataFrame(all_results)

Currently, if I raise an error in one of the spawned processes, it appears that the other processes not only finish running but also start new tasks, despite one of the calls having already failed.

Some searching leads me to believe that this may not be possible. One person seemed to suggest I might need to use concurrent.futures instead, although it is unclear how to map my example onto that one, especially while keeping the real-time feedback as processes finish.

Discussion of concurrent.futures: https://stackoverflow.com/a/47108581/764365
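
For reference, here is a minimal sketch (not from the linked answer) of what that concurrent.futures approach could look like: submit one future per task, print results as they complete via as_completed, and cancel the not-yet-started futures on the first exception. The work function below is a hypothetical stand-in for the real worker, and shutdown(cancel_futures=True) assumes Python 3.9+.

from concurrent.futures import ProcessPoolExecutor, as_completed

def work(x):
    # hypothetical stand-in for the real worker function
    if x == 5:
        raise ValueError("boom")
    return x * x

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        futures = [ex.submit(work, x) for x in range(20)]
        try:
            for fut in as_completed(futures):  # yields futures as they finish
                print(fut.result())            # .result() re-raises the worker's exception
        except Exception as exc:
            # stop handing out queued work; tasks already running still finish
            ex.shutdown(cancel_futures=True)
            print("aborted:", exc)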

tldr; Use imap_unordered to get the smallest latency between a child raising an exception and the main process finding out about it, since it lets you handle results in the main process as soon as they come in via the results Queue. You can then use a wrapper function to build your own "star" version of the function if you desire. As a point of code design, most Pool methods tend to re-raise exceptions from the child, whereas concurrent.futures tends to set an attribute of the return value to indicate the exception that was raised.

from random import random
from functools import partial
from multiprocessing import Pool
from time import sleep

def foo(a, b):
    sleep(random()) #introduce some processing delay to simulate work
    if random() > .95:
        raise Exception("randomly rasied an exception")
    else:
        return f"{a}\t{b}"

def star_helper(func, args):
    return func(*args)

if __name__ == "__main__":
    n = 20
    print("chance of early termination:", (1-.95**n)*100, "%")
    with Pool() as p:
        try:
            for result in p.imap_unordered(partial(star_helper, foo), zip(range(n), range(n))):
                print(result)
        except Exception:
            p.terminate()
            print("terminated")
    print("done") # `with Pool()` joins the child processes to prove they quit early
