Python multiprocessing: why are large chunksizes slower?

I've been profiling some code using Python's multiprocessing module (the 'job' function just squares the number).

import time
import multiprocessing

def job(x):
    # the 'job' function just squares the number
    return x * x

data = range(100000000)
n = 4
time1 = time.time()
processes = multiprocessing.Pool(processes=n)
results_list = processes.map(func=job, iterable=data, chunksize=10000)
processes.close()
time2 = time.time()
print(time2 - time1)
print(results_list[0:10])

One thing I found odd is that the optimal chunksize appears to be around 10k elements - this took 16 seconds on my computer. If I increase the chunksize to 100k or 200k, then it slows to 20 seconds.
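A minimal sketch of how one might sweep a few chunksize values to reproduce this comparison (the squaring job and the specific values below are assumptions taken from the description above):

import time
import multiprocessing

def job(x):
    return x * x

if __name__ == "__main__":
    data = range(100000000)
    # Time the same 4-worker pool over a spread of chunksizes.
    for chunksize in (100, 10000, 100000, 200000):
        with multiprocessing.Pool(processes=4) as pool:
            start = time.time()
            pool.map(job, data, chunksize=chunksize)
            print(chunksize, time.time() - start)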

Could this difference be due to pickling taking longer for longer lists? A chunksize of 100 elements takes 62 seconds, which I'm assuming is due to the extra time required to pass the chunks back and forth between the different processes.
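One quick way to test the pickling part of that hypothesis is to time pickle.dumps on result chunks of different lengths; the sizes here are just illustrative:

import pickle
import timeit

# Time how long one chunk of squared integers takes to serialize.
for size in (100, 10000, 100000):
    chunk = [x * x for x in range(size)]
    seconds = timeit.timeit(lambda: pickle.dumps(chunk), number=10) / 10
    print(f"{size:>7} elements: {seconds * 1e3:.3f} ms "
          f"({seconds / size * 1e6:.3f} us per element)")

If the per-element cost comes out roughly flat, total pickling time simply scales with chunk length, and the real trade-off is between the transfer size of each chunk and the number of round trips between the processes.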

About optimal chunksize:

  1. Having tons of small chunks would allow the 4 different workers to distribute the load more efficiently, thus smaller chunks would be desirable.
  2. On the other hand, context switches between processes add overhead every time a new chunk has to be dispatched, so fewer context switches, and therefore fewer (larger) chunks, are desirable.

As the two rules pull in opposite directions, a point somewhere in the middle is the way to go, similar to a supply-demand chart.
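For reference, when no chunksize is passed at all, CPython's Pool.map picks one such middle point with a heuristic roughly like the sketch below, aiming for about four chunks per worker:

def default_chunksize(n_items, n_workers):
    # Roughly what multiprocessing.pool does when chunksize is None:
    # split the work into about 4 chunks per worker process.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize

print(default_chunksize(100000000, 4))  # -> 6250000 for the example above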
