
Python multiprocess change class instance in place

I have a list of class instances and want to call the same instance method on each of them in parallel. I use pathos so that the instance method can be pickled. The real problem is that changing or adding an attribute on the instances inside the method has no effect. I think this is because pickling the inputs for the sub-process produces a deep copy of them. Does anyone know how to solve this? I don't want to change the way the instance method is written (for example, returning a value and putting the results together later).

from joblib import Parallel, delayed
import pathos.multiprocessing as mp
# import multiprocessing as mp 
import random
import os

pool = mp.Pool(mp.cpu_count())

class Person(object):
    def __init__(self, name):
        self.name = name

    def print_name(self, num):
        self.num = num
        print "worker {}, person name {}, received int {}".format(os.getpid(), self.name, self.num)


people = [Person('a'),
          Person('b'),
          Person('c'),
          Person('d'),
          Person('e'),
          Person('f'),
          Person('g'),
          Person('h')]


for i, per in enumerate(people):
    pool.apply_async(Person.print_name, (per, i) )

pool.close()
pool.join()
print 'their number'
for per in people:
    print per.num

This is the output. The num attribute is not found; I think this is because the change is made on the copies rather than on the original instances.

In [1]: run delme.py
worker 13981, person name a, random int 0
worker 13982, person name b, random int 1
worker 13983, person name c, random int 2
worker 13984, person name d, random int 3
worker 13985, person name e, random int 4
worker 13986, person name f, random int 5
worker 13987, person name g, random int 6
worker 13988, person name h, random int 7
their number
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/chimerahomes/wenhoujx/brain_project/network_analysis/delme.py in <module>()
     39 print 'their number'
     40 for per in people:
---> 41     print per.num

AttributeError: 'Person' object has no attribute 'num'
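The copy behaviour suspected above can be illustrated without any worker processes. This is only a sketch: copy.deepcopy is used here as a stand-in for the pickle round trip that carries arguments to a worker, and the variable names original and worker_copy are made up for the illustration.

```python
import copy


class Person(object):
    def __init__(self, name):
        self.name = name


original = Person("a")

# Stand-in for what a worker receives: an independent copy of the instance,
# not the object that lives in the parent process.
worker_copy = copy.deepcopy(original)

# This mirrors what print_name does inside the worker.
worker_copy.num = 0

print(worker_copy is original)       # the two are distinct objects
print(hasattr(worker_copy, "num"))   # the copy has the new attribute
print(hasattr(original, "num"))      # the original was never touched
```

Because the assignment only ever reaches the copy, the loop over people in the parent finds no num attribute.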

Following a suggestion in the comments, I tried returning self from the child process, but the returned object is not an instance of the original class, which looks like a pathos bug. See the following code:

import pickle
# from joblib import Parallel, delayed
import pathos.multiprocessing as mp
# import multiprocessing as mp 
import random
import os

pool = mp.Pool(mp.cpu_count())

class Person(object):
    def __init__(self, name):
        self.name = name

    def print_name(self, num):
        self.num = num
        print "worker {}, person name {}, received int {}".format(os.getpid(), self.name, self.num)
        # return itself and put everything together
        return self



people = [Person('a'),
          Person('b'),
          Person('c'),
          Person('d'),
          Person('e'),
          Person('f'),
          Person('g'),
          Person('h')]

# Parallel(n_jobs=-1)(delayed(Person.print_name)(per) for per in people)

res = []
for i, per in enumerate(people):
    res.append(pool.apply_async(Person.print_name, (per, i) ))

pool.close()
pool.join()
people = [rr.get() for rr in res]


print 'their number'
for per in people:
    print per.num

print isinstance(people[0], Person)

And this is the output:

In [1]: run delme.py
worker 29963, person name a, received int 0
worker 29962, person name b, received int 1
worker 29964, person name c, received int 2
worker 29962, person name d, received int 3
worker 29966, person name e, received int 4
worker 29967, person name f, received int 5
worker 29966, person name g, received int 6
worker 29967, person name h, received int 7
their number
0
1
2
3
4
5
6
7
False

When I use the standard multiprocessing package instead, there is no such problem.

The problem is that self.num is assigned in the child process. multiprocessing does not pass the original object back to the caller. It does pass the method's return value back. So you could return num directly, or even self (but that is generally inefficient, and it doesn't replace the existing object in the parent; it just creates a new one).
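One way to apply this is sketched below, using the standard multiprocessing package and Python 3 print syntax. The helpers call_print_name and assign_nums are hypothetical names introduced for the example; the idea is simply that the worker returns num and the parent process assigns it to the original instances.

```python
import multiprocessing as mp
import os


class Person(object):
    def __init__(self, name):
        self.name = name

    def print_name(self, num):
        # Runs in a worker on a copy of self, so the value is returned
        # instead of relying on an attribute set here.
        print("worker {}, person name {}, received int {}".format(
            os.getpid(), self.name, num))
        return num


def call_print_name(args):
    # Hypothetical module-level helper: unpacks one (person, num) task.
    person, num = args
    return person.print_name(num)


def assign_nums(people, nums):
    # Runs in the parent, where the original instances live.
    for person, num in zip(people, nums):
        person.num = num


if __name__ == "__main__":
    people = [Person(name) for name in "abcdefgh"]
    tasks = list(zip(people, range(len(people))))

    pool = mp.Pool(mp.cpu_count())
    try:
        # map returns the results in the same order as the tasks.
        nums = pool.map(call_print_name, tasks)
    finally:
        pool.close()
        pool.join()

    assign_nums(people, nums)
    print("their number")
    for per in people:
        print(per.num)
```

The instances in people stay the same objects throughout, so isinstance(people[0], Person) remains true. The trade-off is that the method now returns a value, which the question hoped to avoid, but only the small result travels back instead of the whole instance.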
