How to add a pool of processes available for a multiprocessing queue

I'm following up here on an earlier question: How to add more items to a multiprocessing queue while the script is running

The code I'm using now:

import multiprocessing


class MyFancyClass:

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print('Doing something fancy in {} for {}!'.format(proc_name, self.name))


def worker(q):
    while True:
        obj = q.get()
        if obj is None:
            break
        obj.do_something()


if __name__ == '__main__':
    queue = multiprocessing.Queue()

    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()

    queue.put(MyFancyClass('Fancy Dan'))
    queue.put(MyFancyClass('Frankie'))
    # print(queue.qsize())
    queue.put(None)

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()

Right now, the queue has two items in it. If I replace those two lines with, say, a list of 50 items... how do I initiate a POOL to allow a number of processes? For example:

p = multiprocessing.Pool(processes=4)

Where does that go? I'd like to be able to run a number of items at a time, especially if the items run for a while. Thanks!

Typically, you use either a Pool, or Process(es) plus Queues. Mixing the two is a misuse; a Pool already uses Queues (or a similar mechanism) behind the scenes.
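
For reference, the Process-plus-Queue side of that choice is just the question's code scaled out to several workers. A minimal sketch of that alternative (not from the original answer; it assumes the MyFancyClass and worker definitions above, and note that each worker needs its own None sentinel):

import multiprocessing

if __name__ == '__main__':
    num_workers = 4
    queue = multiprocessing.Queue()

    # Several workers, all pulling from the same queue
    workers = [multiprocessing.Process(target=worker, args=(queue,))
               for _ in range(num_workers)]
    for p in workers:
        p.start()

    for name in ['Fancy Dan', 'Frankie']:  # or a list of 50 names
        queue.put(MyFancyClass(name))

    # One None sentinel per worker, so every worker's loop exits
    for _ in range(num_workers):
        queue.put(None)

    queue.close()
    queue.join_thread()
    for p in workers:
        p.join()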

If you want to do this with a Pool, change your code to the following (moving the code into a main function for performance, and for better resource cleanup than running in the global scope):

def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...] # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # Submit all the work
        futures = [p.apply_async(fancy.do_something) for fancy in myfancyclasses]

        # Done submitting, let workers exit as they run out of work
        p.close()

        # Wait until all the work is finished
        for f in futures:
            f.wait()

if __name__ == '__main__':
    main()

This can be simplified further, at the expense of purity, by using the Pool.*map* methods; e.g., to minimize memory usage, redefine main as:

def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...] # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # No return value, so we ignore it, but we need to run out the result
        # or the work won't be done
        for _ in p.imap_unordered(MyFancyClass.do_something, myfancyclasses):
            pass

Yes, technically speaking, either approach carries slightly more overhead in that the unused return values have to be serialized and sent back to the parent process. But in practice that cost is very low (since your function has no return, it returns None, which serializes to almost nothing). An advantage of this approach is that you generally don't want to print to the screen from the child processes (their output would end up interleaved), so you can replace the print with a return and let the parent do that work, e.g.:

import multiprocessing

class MyFancyClass:
    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        # Changed from print to return
        return 'Doing something fancy in {} for {}!'.format(proc_name, self.name)

def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...] # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # Using the return value now to avoid interleaved output
        for res in p.imap_unordered(MyFancyClass.do_something, myfancyclasses):
            print(res)

if __name__ == '__main__':
    main()
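
As a rough illustration of why that serialization overhead is negligible (a side note, not part of the original answer): pickling None produces only a handful of bytes, which is all each ignored result costs to ship back to the parent:

import pickle

# None serializes to just a few bytes, so returning unused None
# values to the parent process costs almost nothing
print(len(pickle.dumps(None)))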

Note how all of these solutions remove the need to write your own worker function or manually manage a Queue, since the Pool does that grunt work for you.


An alternative approach, using concurrent.futures: it efficiently processes results as they become available, while letting you choose to submit new work along the way (based either on the results, or on external information):

import concurrent.futures
from concurrent.futures import FIRST_COMPLETED

def main():
    allow_new_work = True  # Set to False to indicate we'll no longer allow new work
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...] # define your initial MyFancyClass instances here
    with concurrent.futures.ProcessPoolExecutor() as executor:
        remaining_futures = {executor.submit(fancy.do_something)
                             for fancy in myfancyclasses}
        while remaining_futures:
            done, remaining_futures = concurrent.futures.wait(remaining_futures,
                                                              return_when=FIRST_COMPLETED)
            for fut in done:
                result = fut.result()
                # Do stuff with result, maybe submit new work in response

            if allow_new_work:
                if should_stop_checking_for_new_work():
                    allow_new_work = False
                    # Let the workers exit when all remaining tasks done,
                    # and reject submitting more work from now on
                    executor.shutdown(wait=False)
                elif has_more_work():
                    # Assumed to return collection of new MyFancyClass instances
                    new_fanciness = get_more_fanciness()
                    remaining_futures |= {executor.submit(fancy.do_something)
                                          for fancy in new_fanciness}
                    myfancyclasses.extend(new_fanciness)

if __name__ == '__main__':
    main()
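
The helpers above (should_stop_checking_for_new_work, has_more_work, get_more_fanciness) are placeholders the answer leaves undefined. As one self-contained illustration of the same submit-as-results-arrive pattern (a sketch with hypothetical stand-ins, not part of the original answer), here each completed result triggers one follow-up task until a fixed budget is exhausted:

import concurrent.futures
from concurrent.futures import FIRST_COMPLETED

def square(x):
    # Hypothetical stand-in for MyFancyClass.do_something
    return x * x

def main():
    budget = 10    # hypothetical cap on total submissions
    submitted = 3  # size of the initial batch
    with concurrent.futures.ProcessPoolExecutor() as executor:
        remaining = {executor.submit(square, n) for n in range(3)}
        while remaining:
            done, remaining = concurrent.futures.wait(
                remaining, return_when=FIRST_COMPLETED)
            for fut in done:
                result = fut.result()
                print('got', result)  # printing happens in the parent only
                # Submit follow-up work in response to a result,
                # until the budget is exhausted
                if submitted < budget:
                    remaining.add(executor.submit(square, result))
                    submitted += 1

if __name__ == '__main__':
    main()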

