Parallel processing with multithreading and multiprocessing takes more time than serial processing

Question

I'm trying to learn how to do parallel programming in Python. I wrote a simple integer-squaring function and then ran it serially, with multiple threads, and with multiple processes:

import time
import multiprocessing, threading
import random


def calc_square(numbers):
    sq = 0
    for n in numbers:
        sq = n*n

def splita(list, n):
    a = [[] for i in range(n)]
    counter = 0
    for i in range(0,len(list)):
        a[counter].append(list[i])
        if len(a[counter]) == len(list)/n:
            counter = counter +1
            continue
    return a


if __name__ == "__main__":

    random.seed(1)
    arr = [random.randint(1, 11) for i in xrange(1000000)]
    print "init completed"

    start_time2 = time.time()
    calc_square(arr)
    end_time2 = time.time()

    print "serial: " + str(end_time2 - start_time2)

    newarr = splita(arr,8)
    print 'split complete'

    start_time = time.time()

    for i in range(8):
        t1 = threading.Thread(target=calc_square, args=(newarr[i],))

        t1.start()
        t1.join()

    end_time = time.time()

    print "mt: " + str(end_time - start_time)

    start_time = time.time()

    for i in range(8):
        p1 = multiprocessing.Process(target=calc_square, args=(newarr[i],))
        p1.start()
        p1.join()

    end_time = time.time()

    print "mp: " + str(end_time - start_time)

Output:

init completed
serial: 0.0640001296997
split complete
mt: 0.0599999427795
mp: 2.97099995613

However, as you can see, something odd happened: mt takes about the same time as serial, and mp takes significantly longer (almost 50 times longer).

What am I doing wrong? Could someone point me in the right direction for learning parallel programming in Python?

Edit 01

Looking at the comments, I see that a function that returns nothing may seem pointless. The reason I'm trying this at all is that I previously tried the following add function:

def addi(numbers):
    sq = 0
    for n in numbers:
        sq = sq + n
    return sq

I tried returning the sum of each part to a serial adder, so that I could at least see some performance improvement over a pure serial implementation. However, I couldn't figure out how to store and use the returned values, which is why I'm now trying something even simpler: just dividing up the array and running a simple function on each part.

Thanks!

Answer 1

I think that multiprocessing takes quite a long time to create and start each process. I changed the program to make arr 10 times larger and changed the way the processes are started, and there is a slight speed-up:

(Also note that this is Python 3.)

import time
import multiprocessing, threading
from multiprocessing import Queue
import random

def calc_square_q(numbers,q):
    while q.empty():
        pass
    return calc_square(numbers)

if __name__ == "__main__":

    random.seed(1)   # note how big arr is now vvvvvvv
    arr = [random.randint(1, 11) for i in range(10000000)]
    print("init completed")

    # ...
    # other stuff as before
    # ...

    processes=[]
    q=Queue()
    for arrs in newarr:
        processes.append(multiprocessing.Process(target=calc_square_q, args=(arrs,q)))

    print('start processes')
    for p in processes:
        p.start()  # even tho' each process is started it waits...

    print('join processes')
    q.put(None)   # ... for q to become not empty.
    start_time = time.time()
    for p in processes:
        p.join()

    end_time = time.time()

    print("mp: " + str(end_time - start_time))

Also notice above that I create and start the processes in two separate loops, and then join them in a third loop.
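To illustrate why the separate loops matter, here is a minimal sketch using threads. The names worker, run_sequential, and run_overlapped are hypothetical and not part of the program above. The worker only sleeps, which is an assumption chosen because sleeping releases the GIL in CPython, so the overlap is visible; a CPU-bound pure-Python function such as calc_square would not be expected to speed up with threads in the same way.

```python
import threading
import time


def worker(delay, results, index):
    # Hypothetical stand-in for real work: sleep, then record completion.
    time.sleep(delay)
    results[index] = True


def run_sequential(n, delay):
    """Start and join each thread in the same loop, so threads run one at a time."""
    results = [False] * n
    start = time.perf_counter()
    for i in range(n):
        t = threading.Thread(target=worker, args=(delay, results, i))
        t.start()
        t.join()  # blocks here before the next thread is even created
    return time.perf_counter() - start, results


def run_overlapped(n, delay):
    """Create all threads, start them all, then join them all."""
    results = [False] * n
    threads = [
        threading.Thread(target=worker, args=(delay, results, i))
        for i in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


if __name__ == "__main__":
    seq_time, _ = run_sequential(4, 0.05)
    par_time, _ = run_overlapped(4, 0.05)
    print("start+join in one loop:", seq_time)
    print("separate loops:        ", par_time)
```

Calling join() immediately after start(), as in the question, waits for each worker to finish before the next one begins, so the total time is roughly the sum of the individual times.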

Output (arr has 10,000,000 elements):

init completed
serial: 0.53214430809021
split complete
start threads
mt: 0.5551605224609375
start processes
join processes
mp: 0.2800724506378174

Increasing the size of arr by another factor of 10:

init completed
serial: 5.8455305099487305
split complete
start threads
mt: 5.411392450332642
start processes
join processes
mp: 1.9705185890197754

And yes, I've also tried this in Python 2.7, although threads seemed slower.
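On the point in Edit 01 about storing and using returned values: one possible approach, not the one used in the program above, is multiprocessing.Pool, whose map method returns each worker's result in order. The sketch below reuses addi from the question; split_chunks and parallel_sum are hypothetical helpers introduced only for illustration, and the worker count of 4 is an arbitrary assumption.

```python
import multiprocessing
import random


def addi(numbers):
    # Same add function as in the question.
    sq = 0
    for n in numbers:
        sq = sq + n
    return sq


def split_chunks(numbers, n):
    """Split numbers into n nearly equal contiguous chunks (hypothetical helper)."""
    size, extra = divmod(len(numbers), n)
    chunks = []
    start = 0
    for i in range(n):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(numbers[start:end])
        start = end
    return chunks


def parallel_sum(numbers, n_workers=4):
    """Sum each chunk in a worker process, then add the partial sums serially."""
    chunks = split_chunks(numbers, n_workers)
    with multiprocessing.Pool(processes=n_workers) as pool:
        # pool.map returns one result per chunk, in the same order as chunks.
        partial_sums = pool.map(addi, chunks)
    return sum(partial_sums)


if __name__ == "__main__":
    random.seed(1)
    arr = [random.randint(1, 11) for _ in range(1000000)]
    print("parallel result matches serial:", parallel_sum(arr) == sum(arr))
```

Each chunk has to be pickled and sent to a worker process, so for work as cheap as addition this may still be slower than the serial loop; the sketch is meant to show how return values can be collected, not to promise a speed-up.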
