multiprocessing with nested dictionary
Is there a way to pass a nested dictionary to multiprocessing?
d = {'a': {'x': 1, 'y': 100},
     'b': {'x': 2, 'y': 200}}
I want to start two parallel jobs, one for {'a': {'x': 1, 'y': 100}} and the other for {'b': {'x': 2, 'y': 200}}, and use a function along these lines to build a new dictionary:

def f(d):
    for key in d:
        new_d[key]['x'] = d[key]['x'] * 2
        new_d[key]['y'] = d[key]['y'] * 2
Here is my unsuccessful attempt:
import multiprocessing

def f(key, d, container):
    container[key]['x'] = d[key]['x'] * 2
    container[key]['y'] = d[key]['y'] * 2

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    container = manager.dict()
    d = manager.dict()
    d['a'] = {'x': 1, 'y': 100}
    d['b'] = {'x': 2, 'y': 200}
    p1 = multiprocessing.Process(target=f, args=('a', d, container))
    p2 = multiprocessing.Process(target=f, args=('b', d, container))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
I get a KeyError: 'b'. Also, I want to avoid specifying the processes manually, like p1, p2, and so on. Is there another way to do this?
The nested dictionaries also have to be managed. I added that step to your code and made everything iterate over the members of d, so you don't have to deal with p1, p2, etc.:
import multiprocessing

def f(key, d, container):
    container[key]['x'] = d[key]['x'] * 2
    container[key]['y'] = d[key]['y'] * 2

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    container = manager.dict()
    d = manager.dict()
    d['a'] = {'x': 1, 'y': 100}
    d['b'] = {'x': 2, 'y': 200}
    # This line initialises the nested dicts
    for key in d:
        container[key] = manager.dict()
    # Here we create a list with the processes we started
    processes = []
    for key in d:
        p = multiprocessing.Process(target=f, args=(key, d, container))
        p.start()
        processes.append(p)
    # And finally wait for all of them to finish
    for p in processes:
        p.join()
    # Show the results
    print(container['a'])
    print(container['b'])
The multiprocessing.Pool class may be a better way to solve your problem (have a look at the docs).
@nonDucor is right: you have to create the nested dictionaries using a Manager object.

Here is a simplified solution using more Pythonic dictionary creation, and using the ProcessPoolExecutor interface for concurrency:
from concurrent.futures import ProcessPoolExecutor as Executor
import multiprocessing

def f(key, d, container):
    container[key]['x'] = d[key]['x'] * 2
    container[key]['y'] = d[key]['y'] * 2

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    d = manager.dict({
        'a': manager.dict({'x': 1, 'y': 100}),
        'b': manager.dict({'x': 2, 'y': 200}),
    })
    container = manager.dict({x: manager.dict() for x in d.keys()})
    executor = Executor()
    executor.submit(f, 'a', d, container)
    executor.submit(f, 'b', d, container)
    executor.shutdown()
Just for illustration, here is the functionally identical solution, this time using multithreading (via the ThreadPoolExecutor class) instead of multiprocessing. Note that since memory is shared by the two threads, there is no need for managed dictionaries. Plain ol' dicts are fine:
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor as Executor

def f(key, d, container):
    container[key]['x'] = d[key]['x'] * 2
    container[key]['y'] = d[key]['y'] * 2

if __name__ == '__main__':
    d = {
        'a': {'x': 1, 'y': 100},
        'b': {'x': 2, 'y': 200},
    }
    container = defaultdict(lambda: defaultdict(int))
    executor = Executor()
    executor.submit(f, 'a', d, container)
    executor.submit(f, 'b', d, container)
    executor.shutdown()
    print(d)
    print(container)