
Sharing mutable global variable in Python multiprocessing.Pool

I'm trying to update a shared object (a dict) using the following code, but it does not work: the output is just the input dict, unchanged.

Edit: Essentially, what I'm trying to achieve is to append the 1-based position of each item in data (a list) to one of the dict's lists. The item itself gives the key in the dict.

Expected output: {'2': [2], '1': [1, 4, 6], '3': [3, 5]}
Note: Approach 2 raises TypeError: 'int' object is not iterable

  1. Approach 1

    from multiprocessing import *

    def mapTo(d, tree):
        for idx, item in enumerate(list(d), start=1):
            tree[str(item)].append(idx)

    data = [1, 2, 3, 1, 3, 1]
    manager = Manager()
    sharedtree = manager.dict({"1": [], "2": [], "3": []})
    with Pool(processes=3) as pool:
        pool.starmap(mapTo, [(data, sharedtree) for _ in range(3)])
  2. Approach 2

    from multiprocessing import *

    def mapTo(d):
        global tree
        for idx, item in enumerate(list(d), start=1):
            tree[str(item)].append(idx)

    def initializer():
        global tree
        tree = dict({"1": [], "2": [], "3": []})

    data = [1, 2, 3, 1, 3, 1]
    with Pool(processes=3, initializer=initializer, initargs=()) as pool:
        pool.map(mapTo, data)

You need to use managed lists if you want the changes to be reflected. Reading a plain list out of a manager.dict hands the worker a copy, so append never reaches the shared object. The following works for me:

from multiprocessing import Manager, Pool

def mapTo(d, tree):
    for idx, item in enumerate(list(d), start=1):
        tree[str(item)].append(idx)

if __name__ == '__main__':
    data=[1,2,3,1,3,1]

    with Pool(processes=3) as pool:
        manager = Manager()
        sharedtree= manager.dict({"1":manager.list(), "2":manager.list(),"3":manager.list()})
        pool.starmap(mapTo, [(data,sharedtree ) for _ in range(3)])

    print({k:list(v) for k,v in sharedtree.items()})

This is the output:

{'1': [1, 1, 1, 4, 4, 4, 6, 6, 6], '2': [2, 2, 2], '3': [3, 3, 5, 3, 5, 5]}

Note: you should always use the if __name__ == '__main__': guard when using multiprocessing. Also, avoid starred imports.
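To see why the plain lists in Approach 1 never change, here is a minimal single-process sketch. It assumes Python 3.6 or later (needed for a managed list nested inside a managed dict); the function name demo_copy_semantics is made up for illustration.

```python
from multiprocessing import Manager


def demo_copy_semantics():
    """Compare a plain list and a managed list stored in a managed dict."""
    with Manager() as manager:
        shared = manager.dict({"plain": [], "managed": manager.list()})

        # shared["plain"] returns a *copy* of the stored list, so this
        # append changes only the temporary copy.
        shared["plain"].append(1)

        # shared["managed"] returns a proxy, so this append is forwarded
        # to the manager process and persists.
        shared["managed"].append(2)

        return list(shared["plain"]), list(shared["managed"])


if __name__ == "__main__":
    print(demo_copy_semantics())  # ([], [2])
```

The same thing happens in each pool worker: the append succeeds locally and is then thrown away, which is why the dict comes back unchanged.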

Edit

You have to do this re-assignment if you are on Python < 3.6, so use this for mapTo:

def mapTo(d, tree):
    for idx, item in enumerate(list(d), start=1):
        l = tree[str(item)]
        l.append(idx)
        tree[str(item)] = l
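One caveat if you use this re-assignment with plain lists: read, append and write-back are separate round trips to the manager, so two workers touching the same key can overwrite each other's append. A minimal sketch of one way to guard against that, assuming a manager.Lock() is passed to the workers; the helper name append_with_reassign and the lock are illustrative additions, not part of the code above.

```python
from multiprocessing import Manager, Pool


def append_with_reassign(tree, lock, key, value):
    """Append value to the plain list stored under key in a managed dict."""
    with lock:               # assumption: serialize the read-modify-write
        items = tree[key]    # local copy of the stored list
        items.append(value)  # changes only the copy
        tree[key] = items    # write the copy back to the shared dict


if __name__ == "__main__":
    data = [1, 2, 3, 1, 3, 1]
    with Manager() as manager:
        tree = manager.dict({"1": [], "2": [], "3": []})
        lock = manager.Lock()
        tasks = [(tree, lock, str(item), idx)
                 for idx, item in enumerate(data, start=1)]
        with Pool(processes=3) as pool:
            pool.starmap(append_with_reassign, tasks)
        # Append order across workers is not guaranteed, so sort for display.
        print({key: sorted(values) for key, values in tree.items()})
```

With managed lists (Python 3.6+) each append is a single call on the proxy, so this extra lock should not be needed for the append itself.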

And finally, you aren't using starmap / map correctly: you are passing the whole data list three times, so of course everything gets counted three times. A mapping operation should work on each individual element of the data you are mapping over, so you want something like:

from functools import partial
from multiprocessing import Manager, Pool
def mapTo(i_d,tree):
    idx,item = i_d
    l = tree[str(item)]
    l.append(idx)
    tree[str(item)] = l

if __name__ == '__main__':
    data=[1,2,3,1,3,1]

    with Pool(processes=3) as pool:
        manager = Manager()
        sharedtree= manager.dict({"1":manager.list(), "2":manager.list(),"3":manager.list()})
        pool.map(partial(mapTo, tree=sharedtree), list(enumerate(data, start=1)))

    print({k:list(v) for k,v in sharedtree.items()})
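If the workers only need to report where each item goes, a different technique from the shared dict above is to have them return results and build the dict in the parent, which avoids shared state entirely. A minimal sketch under that assumption; the names key_and_position and group_positions are hypothetical.

```python
from collections import defaultdict
from multiprocessing import Pool


def key_and_position(indexed_item):
    """Worker: turn one (position, item) pair into (dict key, position)."""
    position, item = indexed_item
    return str(item), position


def group_positions(pairs):
    """Parent side: collect the positions under their key."""
    tree = defaultdict(list)
    for key, position in pairs:
        tree[key].append(position)
    return dict(tree)


if __name__ == "__main__":
    data = [1, 2, 3, 1, 3, 1]
    with Pool(processes=3) as pool:
        # pool.map returns results in input order.
        pairs = pool.map(key_and_position, enumerate(data, start=1))
    print(group_positions(pairs))  # {'1': [1, 4, 6], '2': [2], '3': [3, 5]}
```

Because pool.map preserves input order, the lists come out in position order without any locking or manager process.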
