Python multiprocessing/threading takes longer than single processing on a virtual machine

Question

I'm working on a virtual machine that sits in my company's mainframe.

I have 4 cores assigned to work with, so I'm trying to get into parallel processing of my Python code. I'm not familiar with it yet and I'm running into really unexpected behaviour, namely that multiprocessing/threading takes longer than single processing. I can't tell if I'm doing something wrong or if the problem comes from my virtual machine.

Here's an example:

import multiprocessing as mg
import threading
import math
import random
import time

NUM = 4

def benchmark():
  for i in range(1000000):
    math.exp(random.random())

threads = []
random.seed()

print "Linear Processing:"
time0 = time.time()
for i in range(NUM):
  benchmark()
print time.time()-time0

print "Threading:"
for P in range(NUM):
  threads.append(threading.Thread(target=benchmark))
time0 = time.time()
for t in threads:
  t.start()
for t in threads:
  t.join()
print time.time()-time0

threads = []
print "Multiprocessing:"
for i in range(NUM):
  threads.append(mg.Process(target=benchmark))
time0 = time.time()
for t in threads:
  t.start()
for t in threads:
  t.join()
print time.time()-time0

The result looks like this:

Linear Processing:
1.125
Threading:
4.56699991226
Multiprocessing:
3.79200005531

Linear processing is the fastest here, which is the opposite of what I wanted and expected. I'm unsure about how the join statements should be executed, so I also ran the example with the joins like this:

for t in threads:
  t.start()
  t.join()

This leads to output like this:

Linear Processing:
1.11500000954
Threading:
1.15300011635
Multiprocessing:
9.58800005913

Now threading is almost as fast as single processing, while multiprocessing is even slower.

When observing processor load in the task manager, the individual load of the four virtual cores never rises above 30%, even while doing the multiprocessing, so I suspect a configuration problem here.

I want to know if I'm doing the benchmarking correctly and if that behaviour is really as strange as I think it is.

Answer 1

So, firstly, you're not doing anything wrong, and when I run your example on my MacBook Pro, with CPython 2.7.12, I get:

$ python test.py
Linear Processing:
0.733351945877
Threading:
1.20692706108
Multiprocessing:
0.256340026855

However, when I increase the workload by changing:

for i in range(1000000):

To:

for i in range(100000000):

The difference is much more noticeable:

Linear Processing:
77.5861060619
Threading:
153.572453976
Multiprocessing:
33.5992660522

Now why is threading consistently slower? Because of the Global Interpreter Lock (GIL). In CPython only one thread can execute Python bytecode at a time, so threads add switching overhead to CPU-bound work without adding any parallelism. The threading module is mainly useful for waiting on I/O. Your multiprocessing example is the correct way to do this.
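To see the GIL effect in isolation, here is a minimal sketch, assuming Python 3 on a standard CPython build with the GIL. The helper names (`cpu_task`, `io_task`, `run_in_threads`, `run_serially`) are made up for illustration, and `time.sleep` is only a stand-in for a real blocking I/O call.

```python
import math
import random
import threading
import time

def cpu_task(iterations):
    # Pure-Python arithmetic: the running thread holds the GIL while it works
    for _ in range(iterations):
        math.exp(random.random())

def io_task(delay):
    # time.sleep stands in for blocking I/O; it releases the GIL while waiting
    time.sleep(delay)

def run_in_threads(target, args, count):
    # Start all threads first, then join them, so they can overlap
    threads = [threading.Thread(target=target, args=args) for _ in range(count)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

def run_serially(target, args, count):
    # Call the same function count times, one after another, in this thread
    start = time.time()
    for _ in range(count):
        target(*args)
    return time.time() - start

io_serial = run_serially(io_task, (0.2,), 4)
io_threaded = run_in_threads(io_task, (0.2,), 4)
cpu_serial = run_serially(cpu_task, (200000,), 4)
cpu_threaded = run_in_threads(cpu_task, (200000,), 4)

print("I/O-like  serial: %.2f s  threaded: %.2f s" % (io_serial, io_threaded))
print("CPU-bound serial: %.2f s  threaded: %.2f s" % (cpu_serial, cpu_threaded))
```

The sleeping threads should overlap and finish in roughly the time of one sleep, while the CPU-bound threads should take about as long as the serial run or longer. The exact numbers depend on the machine and the Python version.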

So, in your original example, where linear processing was the fastest, I would blame this on the overhead of starting processes. When you're doing a small amount of work, it often takes more time to start 4 processes and wait for them to finish than to just do the work synchronously in a single process. Use a larger workload to benchmark more realistically.
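One way to check how much of the time is process startup is to time processes that do nothing and compare that with the real workload at a couple of sizes. This is only a sketch, assuming Python 3. The names `noop`, `time_processes` and `time_serial` are hypothetical helpers, and `benchmark` is the loop from the question with the iteration count turned into a parameter.

```python
import math
import multiprocessing as mp
import random
import time

def benchmark(iterations):
    # Same CPU-bound loop as in the question, with a configurable size
    for _ in range(iterations):
        math.exp(random.random())

def noop():
    # Does no work, so the measured time is almost entirely startup cost
    pass

def time_processes(target, args, count):
    # Start all processes first, then join them, so they can run in parallel
    procs = [mp.Process(target=target, args=args) for _ in range(count)]
    start = time.time()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.time() - start

def time_serial(iterations, count):
    # Run the workload count times in the current process
    start = time.time()
    for _ in range(count):
        benchmark(iterations)
    return time.time() - start

if __name__ == "__main__":
    # This guard is needed where child processes are spawned rather than
    # forked (e.g. on Windows), because each child re-imports this module
    print("startup only (4 no-op processes): %.3f s" % time_processes(noop, (), 4))
    for n in (10000, 500000):
        print("n=%d  serial:    %.3f s" % (n, time_serial(n, 4)))
        print("n=%d  processes: %.3f s" % (n, time_processes(benchmark, (n,), 4)))
```

If the no-op figure is a large fraction of the multiprocessing time for a given `n`, the benchmark at that size is mostly measuring startup rather than computation.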

Answer 1: 6, accepted, 2016-08-05 10:19:26
