
Class attributes and memory shared between processes in a process pool?

I have a class A that, when instantiated, modifies a mutable class attribute nums.

When instantiating the class via a process Pool with maxtasksperchild=1, I notice that nums holds the values of several different processes, which is undesirable behavior for me.

My questions are:

  • Are the processes sharing memory?
  • Am I misunderstanding maxtasksperchild and the workings of a process pool?

EDIT: I am guessing that the pool pickles the previous processes it started (and not the original one), thus preserving the values of nums. Is that correct? If so, how can I force it to use the original process?

Here is example code:

from multiprocessing import Pool


class A:
    nums = []

    def __init__(self, num=None):
        self.__class__.nums.append(num)  # I use 'self.__class__' for the sake of explicitly
        print(self.__class__.nums)
        assert len(self.__class__.nums) < 2  # checking that they don't share memory


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        pool.map(A, range(99))  # the assert is being raised

EDIT in response to the answer by k.wahome: using instance attributes doesn't answer my question. I need to use class attributes because in my original code (not shown here) I have several instances per process. My question is specifically about the workings of a multiprocessing pool.


By the way, doing the following does work:

from multiprocessing import Process

if __name__ == '__main__':
    prs = []
    for i in range(99):
        pr = Process(target=A, args=[i])
        pr.start()
        prs.append(pr)
    for pr in prs:
        pr.join()
    # the assert was not raised

The sharing is most likely coming in via the mapped class A and its class attribute nums.

Class attributes are bound to the class and so belong to the class itself. They are created when the class is loaded and are shared by all of its instances. All objects hold the same memory reference to a class attribute.

Unlike class attributes, instance attributes are bound to the instance and are not shared between instances. Every instance has its own copy of an instance attribute.

See the class vs. instance attribute effect:

1. Using nums as a class attribute (class_num.py)

from multiprocessing import Pool


class A:
    nums = []

    def __init__(self, num=None):
        # I use 'self.__class__' for the sake of explicitly
        self.__class__.nums.append(num)
        print("nums:", self.__class__.nums)
        # checking that they don't share memory
        assert len(self.__class__.nums) < 2


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        print(pool)
        pool.map(A, range(99))  # the assert is being raised

Running this script:

$ python class_num.py
nums: [0]
nums: [0, 1]
nums: [4]
nums: [4, 5]
nums: [8]
nums: [8, 9]
nums: [12]
nums: [12, 13]
nums: [16]
nums: [16, 17]
nums: [20]
nums: [20, 21]
nums: [24]
nums: [24, 25]
nums: [28]
nums: [28, 29]
nums: [32]
nums: [32, 33]
nums: [36]
nums: [36, 37]
nums: [40]
nums: [40, 41]
nums: [44]
nums: [44, 45]
nums: [48]
nums: [48, 49]
nums: [52]
nums: [52, 53]
nums: [56]
nums: [56, 57]
nums: [60]
nums: [60, 61]
nums: [64]
nums: [64, 65]
nums: [68]
nums: [68, 69]
nums: [72]
nums: [72, 73]
nums: [76]
nums: [76, 77]
nums: [80]
nums: [80, 81]
nums: [84]
nums: [84, 85]
nums: [88]
nums: [88, 89]
nums: [92]
nums: [92, 93]
nums: [96]
nums: [96, 97]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "class_num.py", line 12, in __init__
    assert len(self.__class__.nums) < 2
AssertionError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "class_num.py", line 18, in <module>
    pool.map(A, range(99))  # the assert is being raised
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
AssertionError

2. Using nums as an instance attribute (instance_num.py)

from multiprocessing import Pool


class A:

    def __init__(self, num=None):
        self.nums = []
        if num is not None:
            self.nums.append(num)
        print("nums:", self.nums)
        # checking that they don't share memory
        assert len(self.nums) < 2


if __name__ == '__main__':
    with Pool(maxtasksperchild=1) as pool:
        pool.map(A, range(99))  # the assert is not raised

Running this script:

$ python instance_num.py
nums: [0]
nums: [1]
nums: [2]
nums: [3]
nums: [4]
nums: [5]
nums: [6]
nums: [7]
nums: [8]
nums: [9]
nums: [10]
nums: [11]
nums: [12]
nums: [13]
nums: [14]
nums: [15]
nums: [16]
nums: [17]
nums: [18]
nums: [19]
nums: [20]
nums: [21]
nums: [22]
nums: [23]
nums: [24]
nums: [25]
nums: [26]
nums: [27]
nums: [28]
nums: [29]
nums: [30]
nums: [31]
nums: [32]
nums: [33]
nums: [34]
nums: [35]
nums: [36]
nums: [37]
nums: [38]
nums: [39]
nums: [40]
nums: [41]
nums: [42]
nums: [43]
nums: [44]
nums: [45]
nums: [46]
nums: [47]
nums: [48]
nums: [49]
nums: [50]
nums: [51]
nums: [52]
nums: [53]
nums: [54]
nums: [55]
nums: [56]
nums: [57]
nums: [58]
nums: [59]
nums: [60]
nums: [61]
nums: [62]
nums: [63]
nums: [64]
nums: [65]
nums: [66]
nums: [67]
nums: [68]
nums: [69]
nums: [70]
nums: [71]
nums: [72]
nums: [73]
nums: [74]
nums: [75]
nums: [76]
nums: [77]
nums: [78]
nums: [79]
nums: [80]
nums: [81]
nums: [82]
nums: [83]
nums: [84]
nums: [85]
nums: [86]
nums: [87]
nums: [88]
nums: [89]
nums: [90]
nums: [91]
nums: [92]
nums: [93]
nums: [94]
nums: [95]
nums: [96]
nums: [97]
nums: [98]

Your observation has a different cause. The values in nums do not come from other processes; they come from the same process, which ends up hosting multiple instances of A. This happens because you didn't set chunksize to 1 in your pool.map call. Setting maxtasksperchild=1 is not enough in your case, because one task still consumes a whole chunk of the iterable.
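A minimal sketch of that fix, assuming the same class A as in the question. The helper names make_and_count and count_per_task are hypothetical, and processes=2 and range(8) are arbitrary small values chosen only to keep the demo short. Each task returns how many values its worker process has accumulated in A.nums:

```python
from multiprocessing import Pool


class A:
    nums = []  # class attribute: one list per process, shared by all instances in it

    def __init__(self, num=None):
        type(self).nums.append(num)


def make_and_count(num):
    """Hypothetical helper: build one A, report how many this process has seen."""
    A(num)
    return len(A.nums)


def count_per_task(chunksize):
    # processes=2 and range(8) are arbitrary demo values, not from the question.
    with Pool(processes=2, maxtasksperchild=1) as pool:
        return pool.map(make_and_count, range(8), chunksize=chunksize)


if __name__ == "__main__":
    # One item per task, one task per worker: every worker sees a single value.
    print(count_per_task(chunksize=1))  # [1, 1, 1, 1, 1, 1, 1, 1]
    # Four items per task: a worker still runs one task, but that task builds four A's.
    print(count_per_task(chunksize=4))  # [1, 2, 3, 4, 1, 2, 3, 4]
```

With chunksize=1 the assert from the question would no longer fire, since every worker process is replaced after handling exactly one element.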

From the docs about map: "This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer."
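To see why the output above starts new lists at 0, 4, 8, ..., it can help to reproduce the default chunk size by hand. The sketch below mirrors the heuristic CPython's Pool.map applies when chunksize is not given; this is an implementation detail rather than documented API, so treat it as an assumption that may differ between versions. The worker count of 8 is also an assumption, inferred from the chunk boundaries in the output, and the function names are hypothetical:

```python
def default_chunksize(n_items, n_workers):
    """Approximate CPython's default: about four chunks per worker, rounded up."""
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def chunk_starts(n_items, chunksize):
    """Index of the first element of each chunk."""
    return list(range(0, n_items, chunksize))


if __name__ == "__main__":
    size = default_chunksize(99, 8)      # 8 workers is an assumed value
    print(size)                          # 4
    print(chunk_starts(99, size)[:4])    # [0, 4, 8, 12]
```

Under these assumptions each task holds up to four consecutive numbers, so the second A created within a task already trips the assert, which matches lists such as [0, 1] and [4, 5] in the output.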
