Class attributes and memory shared between processes in a process pool?
I have a class A that, when instantiated, changes a mutable class attribute nums.
When instantiating the class via a process pool with maxtasksperchild=1, I notice that nums holds the values of several different processes, which is undesirable behavior for me.
My questions are:
Am I not understanding maxtasksperchild and the workings of a process pool correctly?
EDIT: I am guessing that the pool pickles the previous processes it started (and not the original one), thus saving the values of nums. Is that correct? And if so, how can I force it to use the original process?
Here is some example code:
from multiprocessing import Pool


class A:
    nums = []

    def __init__(self, num=None):
        self.__class__.nums.append(num)  # I use 'self.__class__' for the sake of explicitness
        print(self.__class__.nums)
        assert len(self.__class__.nums) < 2  # checking that they don't share memory


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        pool.map(A, range(99))  # the assert is being raised
EDIT, because of the answer by k.wahome: using instance attributes doesn't answer my question. I need to use class attributes because in my original code (not shown here) I have several instances per process. My question is specifically about the workings of a multiprocessing pool.
By the way, doing the following does work:
from multiprocessing import Process

if __name__ == '__main__':
    prs = []
    for i in range(99):
        pr = Process(target=A, args=[i])
        pr.start()
        prs.append(pr)
    for pr in prs:
        pr.join()
    # the assert was not raised
The sharing is most likely coming in via the mapped class A and its class attribute nums.
Class attributes are bound to the class: they belong to the class itself, are created when the class is loaded, and are shared by all of its instances. All objects hold the same memory reference to a class attribute.
Unlike class attributes, instance attributes are bound to the instance and are not shared between instances. Every instance has its own copy of an instance attribute.
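To make the difference concrete without any multiprocessing involved, here is a minimal single-process sketch. The class names Shared and Separate are made up for illustration and are not part of the question's code.

```python
class Shared:
    nums = []  # class attribute: one list object shared by every instance

    def __init__(self, num):
        type(self).nums.append(num)


class Separate:
    def __init__(self, num):
        self.nums = [num]  # instance attribute: a new list for each instance


a, b = Shared(1), Shared(2)
print(a.nums is b.nums, Shared.nums)  # True [1, 2]

c, d = Separate(1), Separate(2)
print(c.nums is d.nums, c.nums, d.nums)  # False [1] [2]
```

Note that this only shows sharing between instances that live in the same process.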
See the class vs. instance attribute effect:
1. Using nums as a class attribute (class_num.py)
from multiprocessing import Pool


class A:
    nums = []

    def __init__(self, num=None):
        # I use 'self.__class__' for the sake of explicitness
        self.__class__.nums.append(num)
        print("nums:", self.__class__.nums)
        # checking that they don't share memory
        assert len(self.__class__.nums) < 2


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        print(pool)
        pool.map(A, range(99))  # the assert is being raised
Running this script:
$ python class_num.py
nums: [0]
nums: [0, 1]
nums: [4]
nums: [4, 5]
nums: [8]
nums: [8, 9]
nums: [12]
nums: [12, 13]
nums: [16]
nums: [16, 17]
nums: [20]
nums: [20, 21]
nums: [24]
nums: [24, 25]
nums: [28]
nums: [28, 29]
nums: [32]
nums: [32, 33]
nums: [36]
nums: [36, 37]
nums: [40]
nums: [40, 41]
nums: [44]
nums: [44, 45]
nums: [48]
nums: [48, 49]
nums: [52]
nums: [52, 53]
nums: [56]
nums: [56, 57]
nums: [60]
nums: [60, 61]
nums: [64]
nums: [64, 65]
nums: [68]
nums: [68, 69]
nums: [72]
nums: [72, 73]
nums: [76]
nums: [76, 77]
nums: [80]
nums: [80, 81]
nums: [84]
nums: [84, 85]
nums: [88]
nums: [88, 89]
nums: [92]
nums: [92, 93]
nums: [96]
nums: [96, 97]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "class_num.py", line 12, in __init__
    assert len(self.__class__.nums) < 2
AssertionError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "class_num.py", line 18, in <module>
    pool.map(A, range(99)) # the assert is being raised
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
AssertionError
2. Using nums as an instance attribute (instance_num.py)
from multiprocessing import Pool


class A:
    def __init__(self, num=None):
        self.nums = []
        if num is not None:
            self.nums.append(num)
        print("nums:", self.nums)
        # checking that they don't share memory
        assert len(self.nums) < 2


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        pool.map(A, range(99))  # the assert is not raised here
Running this script:
$ python instance_num.py
nums: [0]
nums: [1]
nums: [2]
nums: [3]
nums: [4]
nums: [5]
nums: [6]
nums: [7]
nums: [8]
nums: [9]
nums: [10]
nums: [11]
nums: [12]
nums: [13]
nums: [14]
nums: [15]
nums: [16]
nums: [17]
nums: [18]
nums: [19]
nums: [20]
nums: [21]
nums: [22]
nums: [23]
nums: [24]
nums: [25]
nums: [26]
nums: [27]
nums: [28]
nums: [29]
nums: [30]
nums: [31]
nums: [32]
nums: [33]
nums: [34]
nums: [35]
nums: [36]
nums: [37]
nums: [38]
nums: [39]
nums: [40]
nums: [41]
nums: [42]
nums: [43]
nums: [44]
nums: [45]
nums: [46]
nums: [47]
nums: [48]
nums: [49]
nums: [50]
nums: [51]
nums: [52]
nums: [53]
nums: [54]
nums: [55]
nums: [56]
nums: [57]
nums: [58]
nums: [59]
nums: [60]
nums: [61]
nums: [62]
nums: [63]
nums: [64]
nums: [65]
nums: [66]
nums: [67]
nums: [68]
nums: [69]
nums: [70]
nums: [71]
nums: [72]
nums: [73]
nums: [74]
nums: [75]
nums: [76]
nums: [77]
nums: [78]
nums: [79]
nums: [80]
nums: [81]
nums: [82]
nums: [83]
nums: [84]
nums: [85]
nums: [86]
nums: [87]
nums: [88]
nums: [89]
nums: [90]
nums: [91]
nums: [92]
nums: [93]
nums: [94]
nums: [95]
nums: [96]
nums: [97]
nums: [98]
Your observation has another cause. The values in nums do not come from other processes but from the same process, once it starts hosting multiple instances of A. This happens because you didn't set chunksize to 1 in your pool.map call. Setting maxtasksperchild=1 is not enough in your case because one task still consumes a whole chunk of the iterable.
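As a rough sketch of that fix, the snippet below keeps nums as a class attribute and compares a multi-item chunk with chunksize=1. The helpers build and observed_lengths are hypothetical names introduced here, and processes=2 and the 8 items are arbitrary choices to keep the run short; the assert from the question is replaced by returning the length of A.nums so both cases can be printed.

```python
from multiprocessing import Pool


class A:
    nums = []  # class attribute, shared by all instances within one process

    def __init__(self, num=None):
        type(self).nums.append(num)


def build(num):
    """Create an A in the worker and report how long A.nums is afterwards."""
    A(num)
    return len(A.nums)


def observed_lengths(chunksize, n_items=8):
    """Map build over range(n_items); each worker process handles one task."""
    with Pool(processes=2, maxtasksperchild=1) as pool:
        return pool.map(build, range(n_items), chunksize=chunksize)


if __name__ == '__main__':
    # One task is a chunk of 4 items, so each worker hosts 4 instances.
    print(observed_lengths(chunksize=4))  # [1, 2, 3, 4, 1, 2, 3, 4]
    # One task is a single item, so every instance gets a fresh process.
    print(observed_lengths(chunksize=1))  # [1, 1, 1, 1, 1, 1, 1, 1]
```

With chunksize=1 and maxtasksperchild=1 together, every item runs in its own worker process, which is the behavior the Process-based loop in the question already had.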
"This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer." (from the docs about map)
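If you are wondering how big those chunks are when chunksize is left unset: as far as I know, CPython derives it from the length of the iterable and the number of workers. The sketch below mirrors that calculation; it is an implementation detail that may differ between versions, not documented API. default_chunksize and chunk_starts are hypothetical helpers, and the 8 workers are an assumption, chosen because it reproduces chunk boundaries at 0, 4, 8, ... for 99 items.

```python
def default_chunksize(n_items, n_workers):
    """Approximate the chunk size Pool.map picks when chunksize is None.

    Mirrors the CPython calculation (an implementation detail):
    aim for about four chunks per worker, rounding up.
    """
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def chunk_starts(n_items, chunksize):
    """Index of the first item in each chunk."""
    return list(range(0, n_items, chunksize))


size = default_chunksize(99, 8)  # 8 workers is an assumption
print(size)  # 4
print(chunk_starts(99, size)[:5])  # [0, 4, 8, 12, 16]
```

So with 99 items, a single task can carry several instances of A into the same process unless chunksize=1 is passed explicitly.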