What happens when I multiprocessing.pool.apply_async more times than I have processors

I have the following setup:

results = [f(args) for _ in range(10**3)]

But f(args) takes a long time to compute, so I'd like to throw multiprocessing at it. I would like to do so by doing:

pool = mp.Pool(mp.cpu_count() - 1)  # mp.cpu_count() -> 8
results = [pool.apply_async(f, (args,)) for _ in range(10**3)]

Clearly, I don't have 1000 processors on my computer, so my concern is:
Does the above call result in 1000 processes simultaneously competing for CPU time, or in 7 processes running simultaneously, each computing the next f(args) when its previous call finishes?

I suppose I could do something like pool.map_async(f, (args for _ in range(10**3))) to get the same results, but the purpose of this post is to understand the behavior of pool.apply_async.
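
For reference, here is a minimal sketch of the two approaches side by side. It assumes a cheap stand-in for f (the real function is not shown in the question) and uses Pool.map_async, which is the method's actual name. Note that apply_async returns AsyncResult objects, so the values have to be fetched with get().

```python
import multiprocessing as mp


def f(x):
    """Hypothetical stand-in for the expensive function in the question."""
    return x * x


def via_apply_async(pool, inputs):
    # One AsyncResult per submitted task; get() blocks until it is done.
    pending = [pool.apply_async(f, (x,)) for x in inputs]
    return [p.get() for p in pending]


def via_map_async(pool, inputs):
    # A single AsyncResult whose get() returns the whole list, in input order.
    return pool.map_async(f, inputs).get()


def compare(inputs, num_workers=2):
    with mp.Pool(num_workers) as pool:
        return via_apply_async(pool, inputs), via_map_async(pool, inputs)


if __name__ == '__main__':
    a, b = compare(list(range(10)))
    print(a == b)
```

Both calls hand the work to the same fixed set of workers; the difference is mainly in how tasks are submitted and how results are collected.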

You'll never have more processes running than there are workers in your pool (in your case, mp.cpu_count() - 1). If you call apply_async and all the workers are busy, the task will be queued and executed as soon as a worker frees up. You can see this with a simple test program:

#!/usr/bin/python

import time
import multiprocessing as mp

def worker(chunk):
    print('working')
    time.sleep(10)
    return

def main():
    pool = mp.Pool(2)  # Only two workers
    for n in range(0, 8):
        pool.apply_async(worker, (n,))
        print("called it")
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

The output is like this:

called it
called it
called it
called it
called it
called it
called it
called it
working
working
<delay>
working
working
<delay>
working
working
<delay>
working
working
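
Another way to check the same thing is to have each task report the PID of the process that ran it. The sketch below is a variation on the test program, not part of the original answer; report_pid and distinct_worker_pids are hypothetical helper names, and the short sleep is only there so that tasks tend to spread across workers.

```python
import multiprocessing as mp
import os
import time


def report_pid(task_id):
    """Hypothetical task: pretend to work, then say which process ran it."""
    time.sleep(0.05)
    return task_id, os.getpid()


def distinct_worker_pids(num_workers, num_tasks):
    """Submit num_tasks jobs to a pool of num_workers; return the PIDs used."""
    with mp.Pool(num_workers) as pool:
        pending = [pool.apply_async(report_pid, (i,)) for i in range(num_tasks)]
        # Collect every result before the pool is torn down.
        finished = [p.get() for p in pending]
    return {pid for _, pid in finished}


if __name__ == '__main__':
    pids = distinct_worker_pids(num_workers=3, num_tasks=50)
    print(len(pids), "distinct worker processes handled 50 tasks")
```

However many tasks are submitted, the set of PIDs should never be larger than the pool size.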

The number of worker processes is wholly controlled by the argument to mp.Pool(). So if mp.cpu_count() returns 8 on your box, mp.Pool(mp.cpu_count() - 1) will create 7 worker processes.

All pool methods (apply_async() among them) then use no more than that many worker processes. Under the covers, arguments are pickled in the main program and sent over an inter-process pipe to worker processes. This hidden machinery effectively creates a work queue, off of which the fixed number of worker processes pull descriptions of work to do (function name + arguments).
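
To make the work-queue idea concrete, here is a hand-rolled sketch of the same pattern built from mp.Process and mp.Queue. It is an illustration only and not how Pool is actually implemented; worker_loop, run_with_fixed_workers and square are hypothetical names, and error handling is left out.

```python
import multiprocessing as mp


def square(x):
    """Hypothetical task used for the demonstration."""
    return x * x


def worker_loop(task_queue, result_queue):
    """Pull (index, func, args) work descriptions until a None sentinel."""
    while True:
        item = task_queue.get()
        if item is None:
            break
        index, func, args = item
        result_queue.put((index, func(*args)))


def run_with_fixed_workers(func, arg_tuples, num_workers):
    """Run func over arg_tuples (a list of tuples) on num_workers processes."""
    task_queue = mp.Queue()
    result_queue = mp.Queue()
    workers = [
        mp.Process(target=worker_loop, args=(task_queue, result_queue))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    # Queue every task up front; idle workers pick them up one at a time.
    for index, args in enumerate(arg_tuples):
        task_queue.put((index, func, args))
    # One sentinel per worker tells it to exit once the queue is drained.
    for _ in workers:
        task_queue.put(None)
    # Results arrive in completion order, so slot them back by index.
    results = [None] * len(arg_tuples)
    for _ in arg_tuples:
        index, value = result_queue.get()
        results[index] = value
    for w in workers:
        w.join()
    return results


if __name__ == '__main__':
    print(run_with_fixed_workers(square, [(i,) for i in range(10)], 2))
```

Submitting 1000 tasks here only makes the queue longer; the number of processes stays at num_workers throughout.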

Other than that, it's all just magic ;-)
