How do I terminate an asynchronous starmap multiprocessing pool in python once any process has encountered an error
I'm using the starmap_async function from Python's multiprocessing library. However, I noticed that if my code encounters an error in one of the processes, the exception is not thrown until all processes have finished. Here's the relevant code:
from multiprocessing import Pool, cpu_count
import datetime
import itertools
import time

import pandas as pd

with Pool(max(cpu_count() // 2, 1)) as p:
    # p = Pool()
    df_iter = df_options.iterrows()
    ir = itertools.repeat
    results = p.starmap_async(_run, zip(df_iter, ir(fixed_options), ir(outputs_grab)), chunksize=1)
    p.close()  # no more jobs to submit

    # Printing progress
    n_remaining = results._number_left + 1
    while not results.ready():
        time.sleep(1)
        # Check for errors here ... How ????
        # Then what? call terminate()?????
        if verbose:
            if results._number_left < n_remaining:
                now = datetime.datetime.now()
                n_remaining = results._number_left
                print('%d/%d %s' % (n_remaining, n_rows, str(now)[11:]))
    print('joining')
    p.join()

all_results = results.get()
df = pd.DataFrame(all_results)
Currently if I raise an error in the spawned processes it appears that other processes not only finish running, but start new tasks despite there being an error from one of the calls.
Some searching is leading me to believe that this may not be possible. One person seemed to suggest I might need to use concurrent.futures instead, although it is unclear how to map my example to that example, especially with keeping the real-time feedback as processes finish.
discussion of concurrent.futures: https://stackoverflow.com/a/47108581/764365
tldr; Use imap_unordered to have the smallest latency to the main process knowing about a child throwing an exception, as it allows you to process results in the main process as soon as they come in via the results Queue. You can then use a wrapper function to build your own "star" version of the function if you desire. As a point of code design, most Pool methods tend to re-raise exceptions from the child, whereas concurrent.futures tends to set an attribute of the return value to indicate the exception that was raised.
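To illustrate that contrast, here is a minimal sketch of the concurrent.futures style, where the exception is stored on the Future and retrieved with Future.exception() rather than re-raised into your loop. The work function, its failing input, and the run helper are hypothetical, made up purely for this example:

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def work(a, b):
    # hypothetical task: fail on one specific input to demonstrate error handling
    if a == 3:
        raise ValueError("boom")
    return a + b

def run(n=6):
    results = []
    with ProcessPoolExecutor() as ex:
        futures = [ex.submit(work, i, i) for i in range(n)]
        for fut in as_completed(futures):  # yields futures as they finish
            if fut.exception() is not None:  # the exception is stored on the future
                for f in futures:
                    f.cancel()  # cancel any tasks that have not started yet
                return results, fut.exception()
            results.append(fut.result())
    return results, None

if __name__ == "__main__":
    done, err = run()
    print("completed:", done, "error:", err)
```

Note that Future.cancel() only stops tasks that have not begun executing; tasks already running in a worker will still finish, which is why Pool.terminate() (as shown below) gives a harder stop.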
from random import random
from functools import partial
from multiprocessing import Pool
from time import sleep

def foo(a, b):
    sleep(random())  # introduce some processing delay to simulate work
    if random() > .95:
        raise Exception("randomly raised an exception")
    else:
        return f"{a}\t{b}"

def star_helper(func, args):
    return func(*args)

if __name__ == "__main__":
    n = 20
    print("chance of early termination:", (1 - .95 ** n) * 100, "%")
    with Pool() as p:
        try:
            for result in p.imap_unordered(partial(star_helper, foo), zip(range(n), range(n))):
                print(result)
        except:
            p.terminate()
            print("terminated")
    print("done")  # `with Pool()` joins the child processes to prove they quit early