Python multiprocessing.Pool.map behavior when list is longer than number of processes

When the submitted task list is longer than the number of worker processes, how are the processes assigned to the tasks?
from multiprocessing import Pool

def f(i):
    print(i)
    return i

with Pool(2) as pool:
    print(pool.map(f, [1, 2, 3, 4, 5]))
I am running a more complex function, and execution does not appear to happen in FIFO order. Here is some example code:
from multiprocessing import Pool
from time import sleep

def f(x):
    print(x)
    sleep(0.1)
    return x * x

if __name__ == '__main__':
    with Pool(2) as pool:
        print(pool.map(f, range(100)))
This prints:
0
13
1
14
2
15
3
16
4
...
If we look at the relevant source code of multiprocessing:
def _map_async(self, func, iterable, mapper, chunksize=None, callback=None,
               error_callback=None):
    '''
    Helper function to implement map, starmap and their async counterparts.
    '''
    self._check_running()
    if not hasattr(iterable, '__len__'):
        iterable = list(iterable)

    if chunksize is None:
        chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
        if extra:
            chunksize += 1
    if len(iterable) == 0:
        chunksize = 0

    task_batches = Pool._get_tasks(func, iterable, chunksize)
Here len(iterable) == 100 and len(self._pool) * 4 == 8, so chunksize, extra = divmod(100, 8) == (12, 4). Since extra is nonzero, chunksize becomes 13. The tasks are therefore split into chunks of 13, which is why the output interleaves item 0 (the start of the first chunk) with item 13 (the start of the second chunk).
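The chunksize arithmetic can be reproduced outside the pool. Below is a minimal sketch; default_chunks is a hypothetical helper (not part of multiprocessing) that mirrors the divmod logic from _map_async above:

```python
def default_chunks(n_items, n_workers):
    # Same default chunksize computation as Pool._map_async.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    # Split the item indices into consecutive chunks of that size.
    chunks = [list(range(i, min(i + chunksize, n_items)))
              for i in range(0, n_items, chunksize)]
    return chunksize, chunks

chunksize, chunks = default_chunks(100, 2)
print(chunksize)                   # 13
print(chunks[0][0], chunks[1][0])  # 0 13
```

With 100 items and 2 workers this yields 8 chunks of size 13 (the last one shorter), and the chunk starts 0 and 13 are exactly the first two values seen in the interleaved output.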
The Pool class represents a pool of worker processes. A worker is not spawned per task; as soon as an existing worker finishes its current chunk, it picks up the next pending one. To see this more clearly, set chunksize=1 and consider the code:
from multiprocessing import Pool
from time import sleep

def f(x):
    print(f"Task {x} enter")
    sleep(5)
    print(f"Task {x} exit")
    return x * x

if __name__ == '__main__':
    with Pool(2) as pool:
        print(pool.map(f, range(10), chunksize=1))
The execution order is:
Task 0 enter
Task 1 enter
Task 0 exit
Task 2 enter
Task 1 exit
Task 3 enter
Task 2 exit
Task 4 enter
Task 3 exit
Task 5 enter
Task 4 exit
Task 6 enter
Task 5 exit
Task 7 enter
Task 6 exit
Task 8 enter
Task 7 exit
Task 9 enter
Task 8 exit
Task 9 exit
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]