multiprocessing.pool with manager and async methods

Question

I am trying to use Manager() to share a dictionary between processes and tried the following code:

from multiprocessing import Manager, Pool

def f(d):
    d['x'] += 2

if __name__ == '__main__':
    manager = Manager()
    d = manager.dict()
    d['x'] = 2
    p = Pool(4)

    for _ in range(2000):
        p.map_async(f, (d,))  # also tried apply_async and map

    p.close()
    p.join()

    print(d)  # expects this result --> {'x': 4002}

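With map_async and apply_async, the printed result is different every time (for example {'x': 3838}, {'x': 3770}). With map, however, I get the expected result. I also tried Process instead of Pool, and the results vary there too.

Any insights? Doesn't the manager take care of the non-blocking part and the race conditions?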

Answer 1

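When you call map (rather than map_async), it blocks until the pool has finished every request you passed in, which in your case is just one call to the function f. So even though your pool size is 4, you are in effect running the 2000 calls one at a time, which is why the result always comes out right. To actually execute in parallel, you would make a single call, p.map(f, [d]*2000), instead of looping. That parallel version is exposed to the same lost updates as the asynchronous calls.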
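The difference between the blocking and the non-blocking call can be seen in isolation. The following is a minimal sketch, not the asker's code: it swaps in multiprocessing.pool.ThreadPool, which offers the same map / map_async interface on threads instead of processes, so that it runs quickly and deterministically. The names slow_double and compare_calls are hypothetical and exist only for this illustration.

```python
import time
from multiprocessing.pool import ThreadPool  # same interface as Pool, but thread-based


def slow_double(x):
    # Hypothetical task: sleep long enough that blocking is easy to observe.
    time.sleep(0.2)
    return 2 * x


def compare_calls():
    with ThreadPool(4) as pool:
        # map blocks: it returns only after the task has finished.
        start = time.perf_counter()
        blocking_result = pool.map(slow_double, [1])
        map_elapsed = time.perf_counter() - start

        # map_async returns an AsyncResult right away; the task is still running.
        start = time.perf_counter()
        pending = pool.map_async(slow_double, [1])
        submit_elapsed = time.perf_counter() - start
        ready_right_away = pending.ready()

        # get() is the call that blocks until the task is done.
        async_result = pending.get()

    return {
        "blocking_result": blocking_result,
        "map_elapsed": map_elapsed,
        "async_result": async_result,
        "submit_elapsed": submit_elapsed,
        "ready_right_away": ready_right_away,
    }


if __name__ == "__main__":
    print(compare_calls())
```

Calling map in a loop therefore waits for each task before the next one is submitted, while a loop over map_async queues all tasks at once and lets the workers pick them up concurrently.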

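When you call map_async, on the other hand, the call does not block and returns a result object. Calling get on that result object blocks until the task finishes and returns the result of the function call. So now up to 4 worker processes are running at the same time. The manager makes each individual dictionary operation safe, but d['x'] += 2 is a read followed by a separate write, and nothing serializes that pair across processes, so two workers can read the same value and one of the updates is lost. I modified the code to force serialization of d['x'] += 2 with a multiprocessing lock. You will see that the result is now 4002.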

from multiprocessing import Manager, Pool, Lock


def f(d):
    # Hold the lock across the whole read-modify-write so no update is lost.
    with lock:
        d['x'] += 2


def init(l):
    # Runs once in each worker process: store the shared lock in a global.
    global lock
    lock = l


if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
        d['x'] = 2
        lock = Lock()  # multiprocessing lock shared by all the worker processes
        # A Lock cannot be sent as a task argument, so pass it through the pool initializer.
        p = Pool(4, initializer=init, initargs=(lock,))

        results = []  # keep the result objects in case f returned something we wanted
        for _ in range(2000):
            results.append(p.map_async(f, (d,)))  # apply_async(f, (d,)) works too

        # If the function returned a result we wanted, wait for everything to finish:
        # for r in results:
        #     r.get()

        p.close()
        p.join()
        print(d)  # {'x': 4002}
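
To see why a lock is needed even though the dictionary lives in a manager, it helps to look at what d['x'] += 2 turns into. The sketch below is an illustration only, using an ordinary in-process dict subclass rather than a real manager proxy; TracingDict, trace_increment and lost_update are hypothetical names. It records the item accesses behind the augmented assignment and then replays, by hand, one interleaving of two workers that loses an update.

```python
class TracingDict(dict):
    """Hypothetical helper: a dict that records every item read and write."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []

    def __getitem__(self, key):
        self.calls.append(("get", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append(("set", key, value))
        super().__setitem__(key, value)


def trace_increment():
    # d['x'] += 2 is executed as one read followed by one separate write.
    d = TracingDict(x=2)
    d["x"] += 2
    return d.calls


def lost_update():
    # Hand-written interleaving of two workers, A and B, without a lock.
    shared = {"x": 2}
    seen_by_a = shared["x"]      # A reads 2
    seen_by_b = shared["x"]      # B reads 2 before A has written
    shared["x"] = seen_by_a + 2  # A writes 4
    shared["x"] = seen_by_b + 2  # B also writes 4, so A's update is lost
    return shared["x"]


if __name__ == "__main__":
    print(trace_increment())  # [('get', 'x'), ('set', 'x', 4)]
    print(lost_update())      # 4, although two increments of 2 should give 6
```

With a manager dictionary each of those two steps is a separate request to the manager process, so another worker can slip in between them. Holding the lock around the whole statement rules that interleaving out.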
