
python multiprocessing: setting class attribute value

I have a class called Experiment and another called Case. One Experiment is made up of many individual Cases. See the class definitions below:

from multiprocessing import Process

class Experiment (object):
    def __init__(self, name):
        self.name = name
        self.cases = []
        self.cases.append(Case('a'))
        self.cases.append(Case('b'))
        self.cases.append(Case('c'))

    def sr_execute(self):
        for c in self.cases:  
            c.setVars(6)

class Case(object):
    def __init__(self, name):
        self.name = name
    def setVars(self, var):
        self.var = var

In my Experiment class, I have a function called sr_execute. This function shows the desired behavior: I want to iterate through all the cases and set an attribute on each of them. When I run the following code:

if __name__ == '__main__':
    #multiprocessing.freeze_support()
    e = Experiment('exp')
    e.sr_execute()
    for c in e.cases: print c.name, c.var

I get:

a 6
b 6
c 6

This is the desired behavior.

However, I would like to do this in parallel using multiprocessing. To do this, I add an mp_execute() function to the Experiment class:

    def mp_execute(self):
        processes = []
        for c in self.cases: 
            processes.append(Process(target= c.setVars, args = (6,)))
        [p.start() for p in processes]
        [p.join() for p in processes]

However, this does not work. When I execute the following:

if __name__ == '__main__':
    #multiprocessing.freeze_support()
    e = Experiment('exp')
    e.mp_execute()
    for c in e.cases: print c.name, c.var

I get an error:

AttributeError: 'Case' object has no attribute 'var'

Apparently, I am unable to set a class attribute using multiprocessing.

Any clues as to what is going on?

When you call:

def mp_execute(self):
    processes = []
    for c in self.cases: 
        processes.append(Process(target= c.setVars, args = (6,)))
    [p.start() for p in processes]
    [p.join() for p in processes]

when you create the Process it works on a copy of your object, and the modifications made to that copy are not passed back to the main program, because different processes have different address spaces. It would work if you used Threads, since in that case no copy is created.
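For example, if the per-case work is light, a thread-based variant of mp_execute (a minimal sketch) would behave as you expect, because threads share the parent's memory:

    # requires: from threading import Thread
    def mp_execute(self):
        # Threads share the parent's address space, so the attribute set by
        # setVars in each thread is visible to the main program.
        threads = [Thread(target=c.setVars, args=(6,)) for c in self.cases]
        for t in threads:
            t.start()
        for t in threads:
            t.join()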

Also note that your code will probably fail on Windows, because you are passing a method as the target and Windows requires the target to be picklable (and instance methods are not picklable). The target should be a function defined at the top level of a module in order to work on all OSes.
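For instance, a sketch with a hypothetical module-level function set_case_var as the target (note that the child process still works on its own copy of the Case, so this alone does not fix the visibility problem):

def set_case_var(case, var):
    # Defined at the top level of the module, so it can be pickled by the
    # "spawn" start method that Windows uses.
    case.setVars(var)

# inside Experiment.mp_execute:
#     processes.append(Process(target=set_case_var, args=(c, 6)))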

If you want to communicate the changes back to the main process, you could:

  • Use a Queue to pass the result
  • Use a Manager to build a shared object

In any case, you must handle the communication explicitly, either by setting up a "channel" (like a Queue) or by setting up shared state.
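A minimal sketch of the Queue approach (compute_var is a hypothetical top-level worker that here simply echoes the value it is given; a real worker would do the actual computation and send the result back):

from multiprocessing import Process, Queue

def compute_var(name, var, q):
    # Send (case name, computed value) back to the parent process.
    q.put((name, var))

class Experiment(object):
    # ... __init__ and sr_execute as before ...
    def mp_execute(self):
        q = Queue()
        processes = [Process(target=compute_var, args=(c.name, 6, q))
                     for c in self.cases]
        for p in processes:
            p.start()
        # Drain the queue before joining, then apply the results in the
        # parent process, where the original Case objects live.
        results = dict(q.get() for _ in processes)
        for p in processes:
            p.join()
        for c in self.cases:
            c.setVars(results[c.name])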


Style note: Do not use list-comprehensions in this way:

[p.join() for p in processes]

It is simply wrong: you are only wasting space by creating a list of Nones, and it is also probably slower than the right way:

for p in processes:
    p.join()

The comprehension is slower because it also has to append each element to the list it is building.

Some say that list-comprehensions are slightly faster than for loops, however:

  • The difference in performance is so small that it generally doesn't matter
  • They are faster if and only if you consider loops of this kind:

     a = []
     for element in something:
         a.append(element)

    If the loop, as in this case, does not create a list, then the for loop will be faster (see the comparison sketched below).
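That is, the comprehension only wins when it replaces a loop that accumulates results with append, i.e. when the list itself is what you want:

    a = [element for element in something]   # builds the same list as the loop above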

By the way: some use map in the same way to perform side effects. This again is wrong, because you won't gain much speed for the same reason as before, and it fails completely in Python 3, where map returns an iterator and hence does not execute the function at all, making the code less portable.
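A quick illustration of the difference (not a recommendation to write code this way):

results = []
map(results.append, [1, 2, 3])
# Python 2: map calls the function eagerly, so results == [1, 2, 3]
# Python 3: map returns a lazy iterator that is never consumed here,
#           so results == [] and the side effects never happen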

@Bakuriu's answer offers good styling and efficiency suggestions. And it is true that each process gets a copy of the master process's memory, so the changes made by the forked processes will not be reflected in the address space of the master process unless you use some form of IPC (e.g. a Queue, Pipe, or Manager).

But the particular AttributeError: 'Case' object has no attribute 'var' error that you are getting has an additional reason, namely that your Case objects do not yet have the var attribute at the time you launch your processes. Instead, the var attribute is created in the setVars() method.

Your forked processes do indeed create the variable when they call setVars() (and even set it to 6), but alas, this change exists only in the copies of the Case objects, i.e. it is not reflected in the master process's memory space (where the variable still does not exist).

To see what I mean, change your Case class to this:

class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = 7  # Create var in the constructor.
    def setVars(self, var):
        self.var = var

By adding the var member variable in the constructor, your master process will have access to it. Of course, the changes in the forked processes will still not be reflected in the master process, but at least you don't get the error:

a 7
b 7
c 7

Hope this sheds light on what's going on. =)


SOLUTION:

The least intrusive thing to do (with respect to your original code) is to use a ctypes object allocated in shared memory:

from multiprocessing import Value
class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = Value('i', 7)              # Use ctypes "int" from shared memory.
    def setVars(self, var):
        self.var.value = var                  # Set the variable's "value" attribute.

and change the print in your __main__ block to use c.var.value:

for c in e.cases: print c.name, c.var.value   # Print the "value" attribute.

Now you have the desired output:

a 6
b 6
c 6
