
python multiprocessing: setting class attribute value

I have a class called Experiment and another called Case. One Experiment is made of many individual cases. See the class definitions below:

from multiprocessing import Process

class Experiment (object):
    def __init__(self, name):
        self.name = name
        self.cases = []
        self.cases.append(Case('a'))
        self.cases.append(Case('b'))
        self.cases.append(Case('c'))

    def sr_execute(self):
        for c in self.cases:  
            c.setVars(6)

class Case(object):
    def __init__(self, name):
        self.name = name
    def setVars(self, var):
        self.var = var

In my Experiment class, I have a method called sr_execute, which shows the desired behavior: it iterates over all the cases and sets an attribute on each of them. When I run the following code,

if __name__ == '__main__':
    #multiprocessing.freeze_support()
    e = Experiment('exp')
    e.sr_execute()
    for c in e.cases: print c.name, c.var

I get:

a 6
b 6
c 6

This is the desired behavior.

However, I would like to do this in parallel using multiprocessing. To do this, I added an mp_execute() method to the Experiment class:

    def mp_execute(self):
        processes = []
        for c in self.cases: 
            processes.append(Process(target= c.setVars, args = (6,)))
        [p.start() for p in processes]
        [p.join() for p in processes]

However, this does not work. When I execute the following,

if __name__ == '__main__':
    #multiprocessing.freeze_support()
    e = Experiment('exp')
    e.mp_execute()
    for c in e.cases: print c.name, c.var

I get an error:

AttributeError: 'Case' object has no attribute 'var'

Apparently, I am unable to set an instance attribute using multiprocessing.

Any clues as to what is going on?

In your code:

def mp_execute(self):
    processes = []
    for c in self.cases: 
        processes.append(Process(target= c.setVars, args = (6,)))
    [p.start() for p in processes]
    [p.join() for p in processes]

Each Process works on a copy of your object, and modifications to that copy are not passed back to the main program, because different processes have different address spaces. It would work if you used Threads, since in that case no copy is created.
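As a minimal sketch of the thread-based variant (written for Python 3; `thread_execute` is a hypothetical helper name, not part of the original code), the same `setVars` calls can be run in threads. Threads share the parent's memory, so the attribute is visible afterwards. Note that in CPython threads do not give CPU parallelism for pure-Python work, so this only illustrates the memory-sharing point.

```python
from threading import Thread


class Case(object):
    def __init__(self, name):
        self.name = name

    def setVars(self, var):
        self.var = var


def thread_execute(cases, value):
    # Hypothetical helper: one thread per case, all sharing the same objects.
    threads = [Thread(target=c.setVars, args=(value,)) for c in cases]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cases


if __name__ == '__main__':
    for c in thread_execute([Case('a'), Case('b'), Case('c')], 6):
        print(c.name, c.var)
```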

Also note that your code will probably fail on Windows, because you are passing a method as target and Windows requires the target to be picklable (in Python 2, instance methods are not picklable). The target should be a function defined at the top level of a module in order to work on all OSes.

If you want to communicate the changes to the main process, you could:

  • Use a Queue to pass the result
  • Use a Manager to build a shared object

Either way, you must handle the communication explicitly, by setting up a channel (like a Queue) or by setting up shared state.
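The Queue option could look roughly like the sketch below (Python 3). The names `worker` and the standalone `mp_execute` function are assumptions made for illustration. The child sends its result back through the queue, and the parent applies it to its own Case objects. The target is a top-level function, so it stays picklable on every platform.

```python
from multiprocessing import Process, Queue


class Case(object):
    def __init__(self, name):
        self.name = name

    def setVars(self, var):
        self.var = var


def worker(index, var, queue):
    # Hypothetical top-level helper: runs in the child process and reports
    # its result back instead of mutating a copy of the Case.
    queue.put((index, var))


def mp_execute(cases, var):
    queue = Queue()
    processes = [Process(target=worker, args=(i, var, queue))
                 for i in range(len(cases))]
    for p in processes:
        p.start()
    # Read one result per process before joining.
    for _ in processes:
        index, value = queue.get()
        cases[index].setVars(value)  # applied in the parent process
    for p in processes:
        p.join()
    return cases


if __name__ == '__main__':
    for c in mp_execute([Case('a'), Case('b'), Case('c')], 6):
        print(c.name, c.var)
```

In this sketch `queue.get()` blocks until a result arrives, so a child that dies before calling `put` would leave the parent waiting; real code would probably want a timeout.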


Style note: do not use list comprehensions in this way:

[p.join() for p in processes]

It's simply wrong. You are only wasting space creating a list of Nones. It is also probably slower than the right way:

for p in processes:
    p.join()

The list comprehension is probably slower because it has to append each result to the list.

Some say that list comprehensions are slightly faster than for loops, however:

  • The difference in performance is so small that it generally doesn't matter
  • They are faster if and only if you consider this kind of loop:

     a = []
     for element in something:
         a.append(element)

    If the loop, as in this case, does not create a list, then the for loop will be faster.

By the way: some use map in the same way to perform side effects. This again is wrong, because you won't gain much speed, for the same reason as before, and it fails completely in Python 3, where map returns an iterator and hence does not execute the functions at all, making the code less portable.
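A small Python 3 sketch of the map pitfall (the names `seen` and `seen_after_map` are illustrative only): the map call alone performs no side effects, while the plain for loop does.

```python
seen = []

# In Python 3, map() only builds a lazy iterator; seen.append is not called here.
lazy = map(seen.append, ['a', 'b', 'c'])
seen_after_map = list(seen)
print(seen_after_map)  # []

# A plain for loop runs the side effect immediately.
for name in ['a', 'b', 'c']:
    seen.append(name)
print(seen)  # ['a', 'b', 'c']
```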

@Bakuriu's answer offers good style and efficiency suggestions. And it is true that each process gets a copy of the master process's memory, hence the changes made by forked processes will not be reflected in the address space of the master process unless you use some form of IPC (e.g. Queue, Pipe, Manager).

But the particular error you are getting, AttributeError: 'Case' object has no attribute 'var', has an additional cause: your Case objects do not yet have the var attribute at the time you launch your processes. Instead, the var attribute is created in the setVars() method.

Your forked processes do indeed create the attribute when they call setVars() (and actually even set it to 6), but alas, this change happens only in the copies of the Case objects, i.e. it is not reflected in the master process's memory space (where the attribute still does not exist).

To see what I mean, change your Case class to this:

class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = 7  # Create var in the constructor.
    def setVars(self, var):
        self.var = var

By adding the var attribute in the constructor, your master process will have access to it. Of course, the changes made in the forked processes will still not be reflected in the master process, but at least you don't get the error:

a 7
b 7
c 7

Hope this sheds light on what's going on. =)


SOLUTION:

The least intrusive change (to the original code) is to use a ctypes object from shared memory:

from multiprocessing import Value
class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = Value('i', 7)              # Use ctypes "int" from shared memory.
    def setVars(self, var):
        self.var.value = var                  # Set the variable's "value" attribute.

and change your main block to print c.var.value:

for c in e.cases: print c.name, c.var.value   # Print the "value" attribute.

Now you have the desired output:

a 6
b 6
c 6
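For reference, the pieces above can be assembled into one script. This is a sketch written for Python 3 (print as a function, explicit loops instead of list comprehensions); the class and method names are the ones from the question. Note that Value('i', ...) holds only a C int, so this approach suits simple numeric attributes.

```python
from multiprocessing import Process, Value


class Case(object):
    def __init__(self, name):
        self.name = name
        self.var = Value('i', 7)  # shared C int, initial value 7

    def setVars(self, var):
        self.var.value = var      # writes into shared memory


class Experiment(object):
    def __init__(self, name):
        self.name = name
        self.cases = [Case('a'), Case('b'), Case('c')]

    def mp_execute(self):
        processes = [Process(target=c.setVars, args=(6,))
                     for c in self.cases]
        for p in processes:
            p.start()
        for p in processes:
            p.join()


if __name__ == '__main__':
    e = Experiment('exp')
    e.mp_execute()
    for c in e.cases:
        print(c.name, c.var.value)
```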
