
Multiprocessing.Manager() weird behavior with global variables

I have a problem with the multiprocessing.Manager class, which behaves very strangely when the manager objects are global variables.

code 1 :

import multiprocessing
from multiprocessing import Manager

manager = Manager()

list1 = manager.list(range(4))
dict1 = manager.dict({"d":1,"f":2})

def process1(list1,dict1):
    print "process1"
    dict1["3"] = 123
    list1.append(10)

def run():
    print "start"
    global list1
    global dict1

    print "list1",list1
    print "dict1",dict1

if __name__ == '__main__':
    print "start"
    j = multiprocessing.Process(target=process1, args=(list1,dict1))
    j.start()
    j.join()
    run()

Output 1:

start
process1
start
list1 [0, 1, 2, 3, 10]
dict1 {'3': 123, 'd': 1, 'f': 2}

OK, that means the global variables list1 and dict1 have been modified by process1.

The problem is that when I try to replace list1 or dict1 it doesn't work!

code 2 :

import multiprocessing
from multiprocessing import Manager

manager = Manager()

list1 = manager.list(range(4))
dict1 = manager.dict({"d":1,"f":2})

def process1(list1,dict1):
    print "process1"
    dict1["3"] = 123
    list1 = manager.list(range(100,104))

def run():
    print "start"
    global list1
    global dict1

    print "list1",list1
    print "dict1",dict1

if __name__ == '__main__':
    print "start"
    j = multiprocessing.Process(target=process1, args=(list1,dict1))
    j.start()
    j.join()
    run()

Output 2:

start
process1
start
list1 [0, 1, 2, 3]
dict1 {'3': 123, 'd': 1, 'f': 2}

Any idea why it returns the initial list [0, 1, 2, 3] instead of [100, 101, 102, 103]?

While a Manager.list object is shared across processes, the names you bind to that object aren't shared at all - not even if you use the same name in all processes. global means the same binding is seen throughout the module in the process that's running the module (unless overridden in some local scope); it doesn't magically mean more than that just because multiprocessing was imported ;-)

Specifically, the name list1 in the main process has nothing to do with the name list1 in the worker process. Their only relation is that both names happen to be initially bound to a single, shared instance of Manager.list, which is usually all you want from them. Rebinding the name list1 to some other object, in either process, has no effect on the object the name list1 is bound to in any other process.

So, in your second example, within the worker process the name list1 becomes (re)bound to a new manager.list(range(100,104)) instance. This has no effect whatsoever on the binding of name list1 in the main process. Nor is there any possible way for the worker process to change the bindings of any names in any other process - and it would be a nightmare if that could happen.
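A minimal sketch of that point, written for Python 3 (the question's code is Python 2). The names rebinding_worker and parent_view_after_rebinding are made up for illustration, the worker rebinds its name to a plain list rather than a new manager.list, and the "fork" start method is pinned as an assumption so the sketch stays self-contained on POSIX systems:

```python
import multiprocessing

# Assumption: pin the POSIX-only "fork" start method so the sketch does not
# depend on the platform's default start method.
ctx = multiprocessing.get_context("fork")


def rebinding_worker(shared):
    # This only binds the worker's local name to a brand-new list.
    # The managed list the parent still refers to is never touched.
    shared = [100, 101, 102, 103]


def parent_view_after_rebinding():
    """Return what the parent sees in the managed list after the worker ran."""
    with ctx.Manager() as manager:
        shared = manager.list(range(4))
        worker = ctx.Process(target=rebinding_worker, args=(shared,))
        worker.start()
        worker.join()
        return list(shared)  # copy the contents out before the manager shuts down


if __name__ == "__main__":
    print(parent_view_after_rebinding())
```

The parent's name is still bound to the original managed list, so its contents come back unchanged.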

You can change shared object values, though. But you appear to already know that. For example, doing

list1[:] = range(100,104)

instead changes no bindings, but replaces the entire content of the shared Manager.list instance. The main process will see the new list content too, not because the names are the same, but because both names are bound to the same object.
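The slice assignment can be tried end to end with something like the following Python 3 sketch. The helper names are hypothetical, the right-hand side is wrapped in list() so a plain list is sent through the proxy, and the "fork" start method is again an assumption made to keep the example self-contained on POSIX:

```python
import multiprocessing

# Assumption: pin the POSIX-only "fork" start method so the sketch does not
# depend on the platform's default start method.
ctx = multiprocessing.get_context("fork")


def replacing_worker(shared):
    # Slice assignment is forwarded by the proxy to the manager process,
    # so the one shared list gets new contents. No name is rebound.
    shared[:] = list(range(100, 104))


def parent_view_after_slice_assignment():
    """Return what the parent sees in the managed list after the worker ran."""
    with ctx.Manager() as manager:
        shared = manager.list(range(4))
        worker = ctx.Process(target=replacing_worker, args=(shared,))
        worker.start()
        worker.join()
        return list(shared)  # copy the contents out before the manager shuts down


if __name__ == "__main__":
    print(parent_view_after_slice_assignment())
```

Here the parent does see the new contents, because both processes hold proxies to the same underlying list.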

By the way, note that in your process1 function, list1 isn't even a global name anyway. It's the name of one of the function's arguments, so acts like a function-local variable name.
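The same distinction shows up without multiprocessing at all. A small Python 3 sketch with illustrative names (items, rebind, mutate are not from the question):

```python
items = [0, 1, 2, 3]  # module-level (global) name


def rebind(items):
    # `items` is the parameter here: a local name that shadows the global one.
    # Assigning to it only changes what the local name refers to.
    items = [100, 101, 102, 103]


def mutate(items):
    # Slice assignment changes the list object itself, so every name bound
    # to that object observes the new contents.
    items[:] = [100, 101, 102, 103]


rebind(items)
after_rebind = list(items)  # snapshot taken after the rebinding call

mutate(items)
after_mutate = list(items)  # snapshot taken after the in-place update

print(after_rebind)
print(after_mutate)
```

Only the second call changes what the module-level name shows, because only that call touches the object rather than a name.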

Short course: stop thinking about names, and think instead in terms of objects. Names are never shared across processes.
