Python multiprocessing: why are large chunksizes slower?
I've been profiling some code that uses Python's multiprocessing module (the 'job' function just squares its argument):
import multiprocessing
import time

def job(x):
    return x * x

data = range(100000000)
n = 4

time1 = time.time()
processes = multiprocessing.Pool(processes=n)
results_list = processes.map(func=job, iterable=data, chunksize=10000)
processes.close()
time2 = time.time()
print(time2 - time1)
print(results_list[0:10])
One thing I found odd is that the optimal chunksize appears to be around 10k elements - this took 16 seconds on my computer. If I increase the chunksize to 100k or 200k, it slows to 20 seconds.
Could this difference be because pickling takes longer for longer lists? A chunksize of 100 elements takes 62 seconds, which I assume is due to the extra overhead of passing many small chunks back and forth between the processes.
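One way to test the pickling hypothesis in isolation is to time `pickle.dumps` on chunk-sized lists directly, separate from the Pool machinery. This is a minimal sketch (the chunk sizes chosen here mirror the ones discussed above; the exact timings will vary by machine):

```python
import pickle
import time

# Time serialization alone for one chunk of each candidate size,
# to see how much of the slowdown pickling by itself can explain.
for chunksize in (100, 10_000, 100_000):
    chunk = list(range(chunksize))
    start = time.perf_counter()
    payload = pickle.dumps(chunk)
    elapsed = time.perf_counter() - start
    print(f"chunksize={chunksize}: {len(payload)} bytes in {elapsed:.6f}s")
```

If serialization time grows roughly linearly with chunk size, raw pickling cost alone does not explain why a few very large chunks are slower overall than many medium ones.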
About optimal chunksize:
As both rules pull in opposite directions (smaller chunks spread the work more evenly across the workers, while bigger chunks reduce the per-chunk dispatch and pickling overhead), a point in the middle is the way to go, similar to a supply-demand chart.
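That middle point can be found empirically. The sketch below sweeps several chunksizes over a smaller input (10^6 elements rather than 10^8, so it finishes quickly); the helper name `time_chunksize` and the specific sizes are my own choices, not from the original post:

```python
import multiprocessing
import time

def job(x):
    return x * x

def time_chunksize(data, chunksize, n=4):
    # Time one Pool.map run with the given chunksize; small chunks pay
    # more dispatch/pickling overhead, large chunks balance load worse.
    with multiprocessing.Pool(processes=n) as pool:
        start = time.perf_counter()
        pool.map(job, data, chunksize=chunksize)
        return time.perf_counter() - start

if __name__ == "__main__":
    data = range(1_000_000)
    for chunksize in (100, 1_000, 10_000, 100_000):
        print(f"chunksize={chunksize}: {time_chunksize(data, chunksize):.3f}s")
```

Plotting elapsed time against chunksize typically shows a U-shaped curve, with the minimum at the compromise point the answer describes.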