Multiprocessing Manager failing on very simple example with pool.apply_async

I'm seeing some unexpected behavior in my code related to Python multiprocessing, and the Manager class in particular. I wrote out a super simple example to try and better understand what's going on:

import multiprocessing as mp
from collections import defaultdict


def process(d):
    print('doing the process')
    d['a'] = []
    d['a'].append(1)
    d['a'].append(2)


def main():
    pool = mp.Pool(mp.cpu_count())
    with mp.Manager() as manager:
        d = manager.dict({'c': 2})
        result = pool.apply_async(process, args=(d))
        print(result.get())

        pool.close()
        pool.join()

        print(d)


if __name__ == '__main__':
    main()

This fails, and the stack trace printed from result.get() is as follows:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "<string>", line 2, in __iter__
  File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/managers.py", line 825, in _callmethod
    proxytype = self._manager._registry[token.typeid][-1]
AttributeError: 'NoneType' object has no attribute '_registry'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mp_test.py", line 34, in <module>
    main()
  File "mp_test.py", line 25, in main
    print(result.get())
  File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
AttributeError: 'NoneType' object has no attribute '_registry'

I'm still unclear on what's happening here. This seems to me to be a very, very straightforward application of the Manager class. It's nearly a copy of the actual example used in the official Python documentation, with the only difference being that I'm using a pool and running the process with apply_async. I'm doing this because that's what I'm using in my actual project.

To clarify, I wouldn't get a stack trace if I didn't have the result = and print(result.get()) in there. Without them, I just see {'c': 2} printed when I run the script, which indicated to me that something was going wrong and wasn't being shown.

A couple of things to start with: first, this isn't the code you ran. The code you posted has

  result = pool.apply_async(process2, args=(d))

but there is no process2() defined. Assuming process was intended, the next thing is the

args=(d)

part. That's the same as typing

args=d

but that's not what's needed. You need to pass a sequence of the intended arguments. So you need to change that part to

args=(d,) # build a 1-tuple

or

args=[d]  # build a list
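A minimal sketch of why the parentheses matter. It uses a plain dict in place of the manager proxy (an assumption, so the snippet runs without starting any processes), and the helper name received is hypothetical. The traceback above shows the pool calling func(*args, **kwds), so whatever is passed as args gets unpacked:

```python
def received(*positional):
    """Return the positional arguments exactly as a called function sees them."""
    return positional


d = {'c': 2}

# Parentheses alone only group an expression; the trailing comma makes the tuple.
print(type((d)).__name__)    # dict
print(type((d,)).__name__)   # tuple

# Unpacking the dict itself iterates over its keys, so the function gets 'c'.
print(received(*(d)))        # ('c',)

# Unpacking a 1-tuple or a 1-element list passes the dict as the single argument.
print(received(*(d,)))       # ({'c': 2},)
print(received(*[d]))        # ({'c': 2},)
```

With a manager proxy instead of a plain dict, the same unpacking step has to iterate over the proxy inside the worker, which is consistent with the __iter__ frame in the traceback.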

With that change, the output becomes

{'c': 2, 'a': []}

Why aren't 1 and 2 in the 'a' list? Because it's only the dict itself that lives on the manager server.

d['a'].append(1)

first gets the mapping for 'a' from the server, which is an empty list. But that empty list is not shared in any way: it's local to process(). You append 1 to it, and then it's thrown away, so the server knows nothing about it. Same thing for 2.
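The copy semantics described here can be observed without a pool. The sketch below is only an illustration: the helper same_object_each_time and the variable names are hypothetical, and it talks to the manager from the parent process rather than from a worker.

```python
import multiprocessing as mp


def same_object_each_time(mapping, key):
    """Return True if two lookups of the same key yield the very same object."""
    return mapping[key] is mapping[key]


def main():
    plain = {'a': []}
    with mp.Manager() as manager:
        shared = manager.dict({'a': []})
        # The lookup returns a local copy of the list; the append changes
        # only that copy, which is then discarded.
        shared['a'].append(1)
        return (
            same_object_each_time(plain, 'a'),
            same_object_each_time(shared, 'a'),
            dict(shared),
        )


if __name__ == '__main__':
    print(main())  # (True, False, {'a': []})
```

A plain dict hands back the same list object on every lookup, while the manager dict hands back a fresh copy each time, so the append never reaches the server.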

To get what you want, you need to "do something" to tell the manager server about what you changed; e.g.,

d['a'] = L = []
L.append(1)
L.append(2)
d['a'] = L
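Putting both fixes together, a complete script might look like the sketch below. It is one possible arrangement rather than the only correct one: the pool size of 2 and the variable name items are arbitrary choices, and the list is built locally and assigned once instead of being assigned twice.

```python
import multiprocessing as mp


def process(d):
    # Build the list locally, then store it with a single assignment so the
    # manager server receives the finished value.
    items = []
    items.append(1)
    items.append(2)
    d['a'] = items


def main():
    with mp.Manager() as manager:
        d = manager.dict({'c': 2})
        with mp.Pool(2) as pool:
            # args is a 1-tuple; get() re-raises any exception from the worker.
            pool.apply_async(process, args=(d,)).get()
        # Copy into a plain dict before the manager shuts down.
        return dict(d)


if __name__ == '__main__':
    print(main())  # {'c': 2, 'a': [1, 2]}
```

Calling get() on the result is worth keeping even when process() returns nothing, since it is what surfaces errors raised inside the worker.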
