Clarification on the `processes` argument for Python's `multiprocessing.Pool`

My question: suppose I execute [pool.apply_async(myfunc, args=(i,)) for i in range(8)] as shown below, with the Pool initialized with multiple processes (here, 4).
Does that mean every function call runs in parallel on 4 processes, and since I am also running 8 function calls in parallel, that there are 4x8 = 32 processes? Or does it run 4 function calls at a time, wait until they finish, and then run the next 4?

import multiprocessing
pool = multiprocessing.Pool(processes=4)
results = [pool.apply_async(myfunc, args=(i,)) for i in range(8)]
results = [res.get() for res in results]

A multiprocessing.Pool will never run more processes in parallel than the number you specified at creation time. Instead, it immediately spawns as many worker processes as you specified and leaves them running until the pool is closed/joined. So in your case, the Pool will always be running exactly four processes, even if none of them are doing any work. If you give the pool eight work items, the first four will immediately begin executing in parallel, while the next four are queued. As soon as one of the worker processes finishes running myfunc, the first queued item is picked up by that now-idle worker.
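One way to check the "never more than four processes" claim directly is to have each task report the PID of the process that ran it. The following is only a minimal sketch: the helper name distinct_worker_pids is made up for illustration, and os.getpid stands in for real work. Because such trivial tasks finish almost instantly, fewer than four workers may end up handling all of them, but the count can never exceed the pool size.

```python
import multiprocessing
import os


def distinct_worker_pids(pool, n_tasks):
    """Submit n_tasks trivial jobs and return the set of PIDs that ran them.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    # os.getpid executes inside whichever worker process picks the task up.
    pending = [pool.apply_async(os.getpid) for _ in range(n_tasks)]
    return {res.get() for res in pending}


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        pids = distinct_worker_pids(pool, 8)
    # Eight tasks were submitted, but at most four processes can have run them.
    print(len(pids), "distinct worker PIDs for 8 tasks")
```

If the answer to the question were "32 processes", the set could contain up to eight different PIDs; with a pool of four it is capped at four.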

You can see this for yourself by running this example:

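import multiprocessing
import time

def myfunc(num):
    print("in here %s" % num)
    time.sleep(2)  # simulate two seconds of work
    print("done with %s" % num)
    return num+2

if __name__ == "__main__":
    pool = multiprocessing.Pool(4)  # four worker processes
    # Submit eight tasks; only four can run at any one time.
    results = [pool.apply_async(myfunc, args=(i,)) for i in range(8)]
    # get() blocks until each task has finished.
    results = [res.get() for res in results]
    print(results)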

Output (the exact interleaving of lines can differ between runs):

in here 0
in here 1
in here 2
in here 3
<2 second pause>
done with 0
done with 3
done with 1
in here 4
in here 5
in here 6
done with 2
in here 7
<2 second pause>
done with 6
done with 7
done with 4
done with 5
[2, 3, 4, 5, 6, 7, 8, 9]
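The two pauses in the output suggest that the eight tasks run in roughly two waves of four. A rough way to confirm this is to time the whole batch. The sketch below assumes tasks that only sleep; time_sleep_tasks is a hypothetical helper name, and the 0.5 second duration is chosen only to keep the run short.

```python
import multiprocessing
import time


def time_sleep_tasks(pool, n_tasks, seconds):
    """Run n_tasks calls of time.sleep(seconds) on the pool; return elapsed wall-clock time.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    start = time.perf_counter()
    pending = [pool.apply_async(time.sleep, args=(seconds,)) for _ in range(n_tasks)]
    for res in pending:
        res.get()  # blocks until that task has finished
    return time.perf_counter() - start


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        elapsed = time_sleep_tasks(pool, n_tasks=8, seconds=0.5)
    # Two waves of four should take about 1 s: longer than a single 0.5 s
    # wave (all eight at once) and shorter than 4 s (one task at a time).
    print("8 tasks of 0.5 s on 4 workers took %.2f s" % elapsed)
```

With four workers, at least one worker has to handle two of the eight tasks, so the batch cannot finish in less than twice the sleep time, however many cores the machine has.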
