python multiprocessing - Sharing a dictionary of classes between processes with subsequent writes from the process reflected to the shared memory

Problem

I need to share a dictionary between processes, where the value in each key-value pair is an instance of a class. The dictionary created with dict() from multiprocessing's manager class is able to store values, but subsequent writes that update those values aren't reflected in the shared memory.

What I've tried

To attempt to solve this problem, I know I have to use a dict() created by a manager from Python's multiprocessing library so that it can be shared between processes. This works with simple values like integers and strings. However, I had hoped that the created dictionary would handle deeper levels of synchronization for me, so that I could just create a class instance inside the dictionary and have changes to it reflected, but it seems multiprocessing is much more complicated than that.

Example

Below I have provided an example program that doesn't work as intended. The printed values aren't what they were set to inside the worker function f().

Note: I am using Python 3 for this example.

from multiprocessing import Manager
import multiprocessing as mp
import random


class ExampleClass:
    def __init__(self, stringVar):
        # these variables aren't saved across processes?
        self.stringVar = stringVar
        self.count = 0


class ProcessContainer(object):
    processes = []

    def __init__(self, *args, **kwargs):
        manager = Manager()
        self.dict = manager.dict()

    def f(self, dict):
        # generate a random index to add the class to
        index = str(random.randint(0, 100))

        # create a new class at that index
        dict[index] = ExampleClass(str(random.randint(100, 200)))

        # this is the problem, it doesn't share the updated variables in the dictionary between the processes <----------------------
        # attempt to change the created variables
        dict[index].count += 1
        dict[index].stringVar = "yeAH"

        # print what's inside
        for x in dict.values():
            print(x.count, x.stringVar)

    def Run(self):
        # create the processes
        for _ in range(3):
            p = mp.Process(target=self.f, args=(self.dict,))
            self.processes.append(p)

        # start the processes
        for proc in self.processes:
            proc.start()

        # wait for the processes to finish
        for proc in self.processes:
            proc.join()


if __name__ == '__main__':
    test = ProcessContainer()
    test.Run()

This is a "gotcha" that holds a lot of surprises for the uninitiated. The problem is that a managed dictionary only propagates an update when you assign to a key through the dictionary itself. Each lookup such as dict[index] hands back a copy of the stored ExampleClass instance, so dict[index].count += 1 changes only that temporary copy and never assigns anything back to the dictionary. Bizarre, I know. This is the modified method f that you need:

def f(self, dict):
    # generate a random index to add the class to
    index = str(random.randint(0, 100))

    # create a new class at that index
    dict[index] = ExampleClass(str(random.randint(100, 200)))

    # dict[index] returns a copy, so fetch it, change it, and then
    # assign it back to make the update visible to the other processes
    ec = dict[index]
    ec.count += 1
    ec.stringVar = "yeAH"
    dict[index] = ec # show new reference
    # print what's inside
    for x in dict.values():
        print(x.count, x.stringVar)
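
For reference, here is a minimal self-contained sketch of the same read-modify-write pattern that can be run as a script. The names worker, run_demo, shared and string_var are made up for this illustration and are not from the question's code. It assumes the default Manager() behaviour, where reading a key returns a copy of the stored object.

```python
from multiprocessing import Manager, Process


class ExampleClass:
    def __init__(self, string_var):
        self.string_var = string_var
        self.count = 0


def worker(shared, key):
    # Store a new instance; the manager process keeps its own copy.
    shared[key] = ExampleClass("initial")

    # Lost update: shared[key] returns a temporary copy, which is
    # incremented and then thrown away.
    shared[key].count += 1

    # Read-modify-write: fetch a copy, change it, assign it back.
    ec = shared[key]
    ec.count += 10
    ec.string_var = "yeAH"
    shared[key] = ec


def run_demo(n_workers=3):
    with Manager() as manager:
        shared = manager.dict()
        procs = [
            Process(target=worker, args=(shared, str(i)))
            for i in range(n_workers)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy plain values out before the manager shuts down.
        return {k: (v.count, v.string_var) for k, v in shared.items()}


if __name__ == "__main__":
    print(run_demo())
```

Only the increment that was assigned back survives, so every entry ends up with a count of 10 rather than 11.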

Note:

Had you used the following code to set the key/value pair, it would actually print False:

ec = ExampleClass(str(random.randint(100, 200)))
dict[index] = ec
print(dict[index] is ec)

This is why, in the modified method f, the assignment dict[index] = ec # show new reference is treated as a new value being set: the dictionary holds its own copy rather than ec itself, so the changed object has to be assigned back.
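
One way to see where the copy comes from: values travel to and from the manager process in pickled form. The sketch below uses plain pickle as a stand-in for that transport (an assumption made for illustration; no manager is involved, and round_tripped is a name invented here).

```python
import pickle


class ExampleClass:
    def __init__(self, string_var):
        self.string_var = string_var
        self.count = 0


ec = ExampleClass("150")

# Round-trip through pickle, roughly what happens when a value is
# stored in the manager process and later read back through the proxy.
round_tripped = pickle.loads(pickle.dumps(ec))

same_object = round_tripped is ec                    # a different object...
same_state = round_tripped.__dict__ == ec.__dict__   # ...with equal attributes

# Mutating the copy leaves the original untouched.
round_tripped.count += 1

print(same_object, same_state, ec.count, round_tripped.count)
```

The copy starts out with the same attributes but is a separate object, which is why changes made to it stay local until it is assigned back.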

Also, you should consider not using dict, a builtin data type, as a variable name.
