Why does it throw a “'module' object has no attribute XXX” error when I call apply_async from multiprocessing.Pool?
The code is below. When I copy and paste it into my cmd prompt, it throws 'module' object has no attribute 'func', but when I save it as a .py file and run python test.py, it runs fine.
import multiprocessing
import time

def func(msg):
    for i in xrange(3):
        print msg
        time.sleep(1)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    for i in xrange(5):
        msg = "hello %d" %(i)
        pool.apply_async(func, (msg, ))
    pool.close()
    pool.join()
    print "Sub-process(es) done."
Can anyone explain the difference between running Python code at the prompt and running it from a file? Thanks a lot!
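This happens because, on Windows, func needs to be pickled and sent to the child process via IPC. For the child to unpickle func, it needs to be able to import it from the parent's __main__ module. When this happens in a normal Python script, the child can re-import the script, and its __main__ will contain all the functions declared at the top level of the script, so it works fine. In the interactive interpreter, however, the functions you have defined cannot simply be re-imported from a file the way they can in a normal script, so they will not be in the child's __main__. This is clearer if you recreate the problem using multiprocessing.Process directly: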
>>> def f():
...     print "HI"
...
>>> import multiprocessing
>>> p = multiprocessing.Process(target=f)
>>> p.start()
>>> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\python27\lib\multiprocessing\forking.py", line 381, in main
    self = load(from_parent)
  File "C:\python27\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "C:\python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\python27\lib\pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "C:\python27\lib\pickle.py", line 1126, in find_class
    klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'f'
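The lookup that fails in the traceback above can be sketched in a single process, without multiprocessing. This is only an illustration, not what multiprocessing does internally: the module name demo_workers is made up, and deleting f from it stands in for a child process whose __main__ never had f defined. The exact error message differs between Python versions, so none is shown.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the parent's module; the name is an assumption.
demo = types.ModuleType("demo_workers")
exec("def f():\n    return 'HI'\n", demo.__dict__)
sys.modules["demo_workers"] = demo

# Pickling a function stores only a reference (module name + function name),
# not the function's code.
payload = pickle.dumps(demo.f)

# While the module still defines f, unpickling returns the very same function.
assert pickle.loads(payload) is demo.f

# Simulate the child: the module exists, but f was never defined in it.
del demo.f
try:
    pickle.loads(payload)
    lookup_failed = False
except AttributeError as exc:
    lookup_failed = True
    print("unpickling failed: %s" % exc)
```

The payload did not change between the two loads calls; only the contents of the module did, which is the situation the child process is in.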
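The traceback makes it clearer that the failure comes from pickle, which cannot find f in the module it looks in. If you add some tracing to pickle.py, you can see that 'module' refers to __main__: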
    def load_global(self):
        module = self.readline()[:-1]
        name = self.readline()[:-1]
        print("module {} name {}".format(module, name))  # I added this.
        klass = self.find_class(module, name)
        self.append(klass)
Re-running the same code with the extra print statement produces the following:
module multiprocessing.process name Process
module __main__ name f
< same traceback as before>
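The same module/name pair can also be read out of a pickle without patching pickle.py. The sketch below uses the standard pickletools module; the helper name pickled_reference is invented for illustration, and it pickles json.dumps rather than a function from __main__ so that it behaves the same from a file or a prompt. For a function typed at the prompt, the module recorded would be __main__, as in the trace above.

```python
import json
import pickle
import pickletools


def pickled_reference(func):
    """Return the (module, name) pair that pickle records for a function."""
    # Protocol 2 writes the reference with the GLOBAL opcode, the same
    # opcode that load_global handles when unpickling.
    payload = pickle.dumps(func, protocol=2)
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            return module, name
    return None


print(pickled_reference(json.dumps))  # ('json', 'dumps')
```

Nothing about the function body appears in the payload, which is why the receiving process must be able to import the module and find the name in it.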
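It is worth noting that this example actually works fine on Posix platforms, because os.fork() is used to spawn the child process, which means that any function defined before the Pool is created will be available in the child's __main__ module. So, while the example above would work there, the following one will still fail, because the worker function is defined after the Pool is created (that is, after os.fork() has been called):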
>>> import multiprocessing
>>> p = multiprocessing.Pool(2)
>>> def f(a):
...     print(a)
...
>>> p.apply(f, "hi")
Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 231, in _bootstrap
    self.run()
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/usr/lib64/python2.6/multiprocessing/queues.py", line 339, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'
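Following the explanation above, here is a sketch of an ordering that avoids this failure: the worker is defined at module top level before the Pool is created, and the file is run as a script rather than pasted into a prompt. The function name shout and the main() wrapper are illustrative choices, not part of the original question, and the code is written so that it should run on Python 3 as well as Python 2.7.

```python
import multiprocessing


def shout(text):
    """Worker defined at top level, so child processes can find it by name."""
    return text.upper()


def main():
    # The pool is created only after shout exists in this module.
    pool = multiprocessing.Pool(2)
    try:
        # args must be a tuple, hence the trailing comma.
        result = pool.apply(shout, ("hi",))
    finally:
        pool.close()
        pool.join()
    print(result)
    return result


if __name__ == "__main__":
    main()
```

The if __name__ == "__main__" guard matters on Windows, where the child re-imports the script: without it, each child would try to create its own Pool during that import.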