How to tell if an apply_async function has started or if it's still in the queue with multiprocessing.Pool
I'm using Python's multiprocessing.Pool and apply_async to call a bunch of functions.
How can I tell whether a function has started being processed by a member of the pool or whether it is still sitting in the queue?
For example:
import multiprocessing
import time

def func(t):
    # take some time processing
    print('func({}) started'.format(t))
    time.sleep(t)

pool = multiprocessing.Pool()
results = [pool.apply_async(func, [t]) for t in [100]*50]  # adds 50 func calls to the queue
For each AsyncResult in results you can call ready() or get(0) to see if the func finished running. But how do you find out whether the func started but hasn't finished yet?
I.e., for a given AsyncResult object (i.e. a given element of results), is there a way to see whether the function has been called or if it's sitting in the pool's queue?
First, remove completed jobs from the results list:
results = [r for r in results if not r.ready()]
The number of jobs still pending is the length of the results list:
pending = len(results)
The number pending but not yet started is the total pending minus pool_size (clamped at zero, because fewer jobs than workers may remain):
not_started = max(0, pending - pool_size)
pool_size will be multiprocessing.cpu_count() if the Pool is created with the default argument, as you did.
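The three steps above can be wrapped in one helper. This is only a sketch: the name estimate_queue_state is made up for illustration, and the result is an estimate that assumes every worker is busy whenever jobs remain. It says how many jobs are probably still queued, not which particular AsyncResult is running.

```python
def estimate_queue_state(results, pool_size):
    """Estimate (pending, not_started) from a list of AsyncResult objects.

    Hypothetical helper: it relies only on AsyncResult.ready() and assumes
    each worker in the pool is running exactly one unfinished job.
    """
    # Jobs that have not finished yet (either running or still queued).
    pending = sum(1 for r in results if not r.ready())
    # At most pool_size of those can be running; the rest must be queued.
    not_started = max(0, pending - pool_size)
    return pending, not_started


# Example use with the pool from the question (left as comments so the
# snippet does not start any processes on its own):
#   pending, not_started = estimate_queue_state(
#       results, multiprocessing.cpu_count())
```

Because the helper only calls ready(), it works with any object that exposes that method.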
UPDATE: After initially misunderstanding the question, here's a way to do what the OP was asking about.
I suspect this functionality could be added to the Pool class without too much trouble, because AsyncResult is implemented by Pool with a Queue. That queue could also be used internally to indicate whether a task has started or not.
But here's a way to implement it using Pool and Pipe. NOTE: this doesn't work in Python 2.x (not sure why). Tested in Python 3.8.
import multiprocessing
import time
import os

def worker_function(pipe):
    # Tell the parent that a pool worker has picked up this task.
    pipe.send('started')
    print('[{}] started pipe={}'.format(os.getpid(), pipe))
    time.sleep(3)
    pipe.close()

def test():
    pool = multiprocessing.Pool(processes=2)
    print('[{}] pool={}'.format(os.getpid(), pool))

    workers = []

    # Submit 3 tasks to a pool of 2, so one task has to wait in the queue.
    for x in range(1, 4):
        parent, child = multiprocessing.Pipe()
        pool.apply_async(worker_function, (child,))
        worker = {'name': 'worker{}'.format(x), 'pipe': parent, 'started': False}
        workers.append(worker)

    pool.close()

    while True:
        # Poll the pipe of every task that has not reported in yet.
        for worker in workers:
            if worker.get('started'):
                continue
            pipe = worker.get('pipe')
            if pipe.poll(0.1):
                message = pipe.recv()
                print('[{}] {} says {}'.format(os.getpid(), worker.get('name'), message))
                worker['started'] = True
                pipe.close()

        # Count the tasks that are still waiting in the pool's queue.
        count_in_queue = len(workers)
        for worker in workers:
            if worker.get('started'):
                count_in_queue -= 1
        print('[{}] count_in_queue = {}'.format(os.getpid(), count_in_queue))
        if not count_in_queue:
            break
        time.sleep(0.5)

    pool.join()

if __name__ == '__main__':
    test()
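As a variation on the same idea, the "started" signal can go through one shared dictionary instead of one Pipe per task. The sketch below swaps in multiprocessing.Manager().dict() for the pipes, so it is a different technique from the code above, not a rewrite of it. The names tracked, job_state, slow_task and demo are all hypothetical, and it assumes Python 3, where Pool and Manager can be used as context managers.

```python
import multiprocessing
import time


def tracked(started, job_id, func, *args):
    """Mark job_id as started in the shared mapping, then run the real function."""
    started[job_id] = True
    return func(*args)


def job_state(async_result, job_id, started):
    """Classify a job as 'finished', 'running' or 'queued'."""
    if async_result.ready():
        return 'finished'
    if job_id in started:
        return 'running'
    return 'queued'


def slow_task(seconds):
    """Stand-in for real work (hypothetical)."""
    time.sleep(seconds)
    return seconds


def demo():
    with multiprocessing.Manager() as manager:
        started = manager.dict()  # proxy shared by the parent and the workers
        with multiprocessing.Pool(processes=2) as pool:
            results = {
                job_id: pool.apply_async(tracked, (started, job_id, slow_task, 1))
                for job_id in range(4)
            }
            time.sleep(0.5)  # give the workers a moment to pick up jobs
            for job_id, res in results.items():
                print(job_id, job_state(res, job_id, started))
            for res in results.values():
                res.get()  # wait for every job before the pool is torn down
```

Call demo() from under an if __name__ == '__main__': guard, as test() is called above. The trade-off is an extra manager process and a round trip to it for every lookup, in exchange for not having to keep one pipe per task.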