How to understand multiprocessing.Queue when working with multiprocessing.Pool?
Why can't I put a process from a Pool into a Queue?
Here my code works when using a Pool, and I can get the Test instance attributes:
from multiprocessing import Pool
from multiprocessing import Queue

class Test(object):
    def __init__(self, num):
        self.num = num

if __name__ == '__main__':
    p = Pool()
    procs = []
    for i in range(5):
        proc = p.apply_async(Test, args=(i,))
        procs.append(proc)
    p.close()
    for each in procs:
        test = each.get(10)
        print(test.num)
    p.join()
When I try to use a Queue instead of a Python list to store the results, it doesn't work. My code:
from multiprocessing import Pool
from multiprocessing import Queue

class Test(object):
    def __init__(self, num):
        self.num = num

if __name__ == '__main__':
    p = Pool()
    q = Queue()
    for i in range(5):
        proc = p.apply_async(Test, args=(i,))
        q.put(proc)
    p.close()
    while not q.empty():
        q.get()
    p.join()
Error msg:
Traceback (most recent call last):
  File "C:\Users\laich\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "C:\Users\laich\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.lock objects
I went to look at the multiprocessing docs:
class multiprocessing.Queue([maxsize])
Returns a process shared queue implemented using a pipe and a few locks/semaphores. When a process first puts an item on the queue a feeder thread is started which transfers objects from a buffer into the pipe.
The usual queue.Empty and queue.Full exceptions from the standard library's queue module are raised to signal timeouts. Queue implements all the methods of queue.Queue except for task_done() and join().
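Whatever is put on a multiprocessing.Queue is handed to that feeder thread, which pickles it before writing it to the pipe. A minimal sketch of why an object that holds a lock fails at that step (the Holder class is made up for illustration; it is not part of the question's code):

```python
import pickle
import threading

class Holder:
    """A toy object that, like AsyncResult, holds a lock internally."""
    def __init__(self):
        self.lock = threading.Lock()  # lock objects cannot be pickled

try:
    # This is essentially what the feeder thread attempts to do
    pickle.dumps(Holder())
except TypeError as e:
    print("pickle failed:", e)
```

The feeder thread hits the same TypeError, but since it runs in the background the error surfaces asynchronously, as in the traceback above.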
Here it says "puts an item"; can't this item be any Python object? In my case I am trying to put a Pool() process into a Queue.
There are at least two problems with your Queue-based code. First, the Pool.apply_async method returns an AsyncResult object, not a process. You can call get on this object to obtain the result of the corresponding task. With this difference in mind, let's look at your code:
proc = p.apply_async(Test, args=(i,))  # returns an AsyncResult object
q.put(proc)                            # won't work
The second line will always fail in your case. Anything you put in the queue must be picklable, because multiprocessing.Queue uses serialization. This is not well documented, and there is an open issue in Python's issue tracker to update the documentation. The problem is that AsyncResult is not picklable. You can try it yourself:
import pickle
import multiprocessing as mp

with mp.Pool() as p:
    result = p.apply_async(lambda x: x, (1,))
    pickle.dumps(result)  # Error
AsyncResult contains some lock objects internally, and they are not serializable. Let's move to the next problem:
while not q.empty():
    q.get()
If I'm not wrong, in the code above you want to call AsyncResult.get, not Queue.get. In that case you would have to first get your object from the queue and then call the corresponding method on it. However, this never happens in your code, since the AsyncResult is not serializable and never makes it into the queue.
As @Mehdi Sadeghi explained, AsyncResult objects can't be pickled, which multiprocessing.Queue requires. However, you don't need a multiprocessing queue here, because the queue isn't being shared among processes. This means you can just use a regular queue.Queue.
from multiprocessing import Pool
#from multiprocessing import Queue
from queue import Queue

class Test(object):
    def __init__(self, num):
        self.num = num
        print('Test({!r}) created'.format(num))

if __name__ == '__main__':
    p = Pool()
    q = Queue()
    for i in range(5):
        proc = p.apply_async(Test, args=(i,))
        q.put(proc)
    p.close()
    while not q.empty():
        q.get()
    p.join()
    print('done')
Output:
Test(0) created
Test(1) created
Test(2) created
Test(3) created
Test(4) created
done