Python Multiprocessing: How to add or change number of processes in a pool
How to add a pool of processes available for a multiprocessing queue
I'm following up on an earlier question here: How to add more items to a multiprocessing queue while a script is running
The code I'm using now:
import multiprocessing

class MyFancyClass:

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print('Doing something fancy in {} for {}!'.format(proc_name, self.name))

def worker(q):
    while True:
        obj = q.get()
        if obj is None:
            break
        obj.do_something()

if __name__ == '__main__':
    queue = multiprocessing.Queue()

    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()

    queue.put(MyFancyClass('Fancy Dan'))
    queue.put(MyFancyClass('Frankie'))
    # print(queue.qsize())

    queue.put(None)

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()
Right now, the queue holds two items. If I replace those two lines with, say, a list of 50 items, how do I start a Pool so that several processes run at once? For example:
p = multiprocessing.Pool(processes=4)
Where does that go? I'd like to be able to run more than one item at a time, especially since the items take a while to run. Thanks!
Generally, you use either a Pool, or a Process (or several) plus a Queue. Mixing the two is a misuse; a Pool already uses a Queue (or a similar mechanism) behind the scenes.
If you want to do this with a Pool, change your code to the following (moving the code into a main function, both for performance and to clean up resources better than running in the global scope):
def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # Submit all the work
        futures = [p.apply_async(fancy.do_something) for fancy in myfancyclasses]
        # Done submitting, let workers exit as they run out of work
        p.close()
        # Wait until all the work is finished
        for f in futures:
            f.wait()

if __name__ == '__main__':
    main()
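If the tasks do produce something useful, the same `apply_async` pattern can collect the results via each `AsyncResult`'s `.get()`. A minimal sketch under that assumption (`slow_square` is an illustrative stand-in, not part of the original code):

```python
import multiprocessing

def slow_square(n):
    # Stand-in for a MyFancyClass.do_something that returns a value
    return n * n

def run_squares(numbers):
    with multiprocessing.Pool(processes=4) as p:
        # Submit everything, then collect; .get() blocks until
        # that particular task's result is ready
        results = [p.apply_async(slow_square, (n,)) for n in numbers]
        p.close()
        return [r.get() for r in results]
```

Because the results are collected in submission order, `run_squares([1, 2, 3])` returns the squares in the same order the inputs were submitted, regardless of which worker finished first.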
You can simplify this further, at a slight cost in purity, with Pool's .*map* methods. For example, to minimize memory usage, redefine main as:
def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # No return value, so we ignore it, but we need to run out the result
        # or the work won't be done
        for _ in p.imap_unordered(MyFancyClass.do_something, myfancyclasses):
            pass
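When there are many small tasks, the per-task round trip between parent and workers can dominate the runtime; `imap_unordered` accepts a `chunksize` argument that batches several tasks per round trip. A hedged sketch (the `double` function is a placeholder for your real work):

```python
import multiprocessing

def double(n):
    # Placeholder for a cheap per-item task
    return n * 2

def run_doubles(numbers):
    with multiprocessing.Pool(processes=4) as p:
        # chunksize batches tasks to cut IPC overhead; results may arrive
        # out of submission order, so sort if order matters to you
        return sorted(p.imap_unordered(double, numbers, chunksize=16))
```

A larger `chunksize` trades scheduling granularity for lower overhead; the best value depends on how long each task takes.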
Yes, technically either approach has slightly higher overhead, in that the unused return value must be serialized and sent back to the parent process. In practice, though, that cost is very low (since your function has no return, it returns None, which serializes to almost nothing). An advantage of this approach: printing to the screen from the child processes is generally undesirable (their output will end up interleaved), so you can replace the print with a return and let the parent do the printing, e.g.:
import multiprocessing

class MyFancyClass:

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        # Changed from print to return
        return 'Doing something fancy in {} for {}!'.format(proc_name, self.name)

def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # Using the return value now to avoid interleaved output
        for res in p.imap_unordered(MyFancyClass.do_something, myfancyclasses):
            print(res)

if __name__ == '__main__':
    main()
Note how all of these solutions eliminate the need to write your own worker function or manually manage a Queue, because Pool does that heavy lifting for you.
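Filled in with concrete instances in place of the `...` placeholder, that final pattern runs end to end. A self-contained sketch (the names passed in are arbitrary examples):

```python
import multiprocessing

class MyFancyClass:

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        return 'Doing something fancy in {} for {}!'.format(proc_name, self.name)

def run_pool(names):
    instances = [MyFancyClass(name) for name in names]
    with multiprocessing.Pool(processes=4) as p:
        # Collect the formatted strings in the parent; with
        # imap_unordered the order of results is not guaranteed
        return list(p.imap_unordered(MyFancyClass.do_something, instances))
```

Both the class and the instances must be picklable for this to work, which is why `MyFancyClass` is defined at module level rather than inside a function.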
An alternative approach uses concurrent.futures to efficiently process results as they become available, while letting you choose to submit new work as you go (based on the results, or on outside information):
import concurrent.futures
from concurrent.futures import FIRST_COMPLETED

def main():
    allow_new_work = True  # Set to False to indicate we'll no longer allow new work
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your initial MyFancyClass instances here
    with concurrent.futures.ProcessPoolExecutor() as executor:
        remaining_futures = {executor.submit(fancy.do_something)
                             for fancy in myfancyclasses}
        while remaining_futures:
            done, remaining_futures = concurrent.futures.wait(remaining_futures,
                                                              return_when=FIRST_COMPLETED)
            for fut in done:
                result = fut.result()
                # Do stuff with result, maybe submit new work in response

            if allow_new_work:
                if should_stop_checking_for_new_work():
                    allow_new_work = False
                    # Let the workers exit when all remaining tasks done,
                    # and reject submitting more work from now on
                    executor.shutdown(wait=False)
                elif has_more_work():
                    # Assumed to return collection of new MyFancyClass instances
                    new_fanciness = get_more_fanciness()
                    remaining_futures |= {executor.submit(fancy.do_something)
                                          for fancy in new_fanciness}
                    myfancyclasses.extend(new_fanciness)
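The loop above leans on placeholder helpers (`should_stop_checking_for_new_work`, `has_more_work`, `get_more_fanciness`) that you would supply yourself. Stripped of those, the core wait-as-they-complete pattern looks like this minimal, runnable sketch (`square` is an illustrative stand-in for the real task):

```python
import concurrent.futures
from concurrent.futures import FIRST_COMPLETED

def square(n):
    # Stand-in for the real per-item work
    return n * n

def run_until_done(numbers):
    results = []
    with concurrent.futures.ProcessPoolExecutor() as executor:
        remaining = {executor.submit(square, n) for n in numbers}
        while remaining:
            # Wake up as soon as any one task finishes, rather than
            # waiting for the whole batch
            done, remaining = concurrent.futures.wait(remaining,
                                                      return_when=FIRST_COMPLETED)
            for fut in done:
                results.append(fut.result())
    return results
```

Inside the `for fut in done:` loop is where you could react to a result, including submitting new futures into `remaining`, which is exactly what the fuller example above does.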