
Parallelizing through Multi-threading and Multi-processing taking significantly more time than serial

I am trying to learn how to do parallel programming in python. I wrote a simple int square function and then ran it in serial, multi-thread, and multi-process:

import time
import multiprocessing, threading
import random


def calc_square(numbers):
    sq = 0
    for n in numbers:
        sq = n*n

def splita(list, n):
    a = [[] for i in range(n)]
    counter = 0
    for i in range(0,len(list)):
        a[counter].append(list[i])
        if len(a[counter]) == len(list)/n:
            counter = counter +1
            continue
    return a


if __name__ == "__main__":

    random.seed(1)
    arr = [random.randint(1, 11) for i in xrange(1000000)]
    print "init completed"

    start_time2 = time.time()
    calc_square(arr)
    end_time2 = time.time()

    print "serial: " + str(end_time2 - start_time2)

    newarr = splita(arr,8)
    print 'split complete'

    start_time = time.time()

    for i in range(8):
        t1 = threading.Thread(target=calc_square, args=(newarr[i],))

        t1.start()
        t1.join()

    end_time = time.time()

    print "mt: " + str(end_time - start_time)

    start_time = time.time()

    for i in range(8):
        p1 = multiprocessing.Process(target=calc_square, args=(newarr[i],))
        p1.start()
        p1.join()

    end_time = time.time()

    print "mp: " + str(end_time - start_time)

Output:

init completed
serial: 0.0640001296997
split complete
mt: 0.0599999427795
mp: 2.97099995613

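However, as you can see, something weird happened: mt takes the same time as serial, and mp actually takes significantly longer (almost 50 times longer).

What am I doing wrong? Could someone point me in the right direction for learning parallel programming in python?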

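Edit 01

Looking at the comments, I see that a function that doesn't return anything may seem pointless. The reason I was even trying this is that I had previously tried the following add function: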

def addi(numbers):
    sq = 0
    for n in numbers:
        sq = sq + n
    return sq

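I tried returning the sum of each part to a serial adder, so that I could at least see some performance improvement over a pure serial implementation. However, I couldn't figure out how to store and use the returned values, which is why I tried something even simpler: just splitting up the array and running a simple function on each part.

Thanks!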
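For the addi case described in the edit, one possible way to get the per-chunk results back is an executor's map call, which returns each worker's return value in order. The sketch below is not from the original post: split_into_chunks and parallel_sum are hypothetical helper names, and concurrent.futures is swapped in for the raw Thread/Process objects used in the question.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def addi(numbers):
    # Same idea as the addi function in the question: sum one chunk.
    total = 0
    for n in numbers:
        total += n
    return total


def split_into_chunks(items, n_chunks):
    # Hypothetical helper: contiguous chunks of (at most) equal size.
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(numbers, n_workers, executor_cls=ProcessPoolExecutor):
    # Hypothetical helper: map addi over the chunks, then combine serially.
    chunks = split_into_chunks(numbers, n_workers)
    with executor_cls(max_workers=n_workers) as pool:
        # pool.map yields each worker's return value, in chunk order.
        partial_sums = list(pool.map(addi, chunks))
    return sum(partial_sums)
```

With the default ProcessPoolExecutor, the call should sit under the `if __name__ == "__main__":` guard, as in the question's script. Passing `executor_cls=ThreadPoolExecutor` runs the same code with threads, which is handy for checking correctness but is not expected to speed up a pure-Python loop in CPython.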

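I think that multiprocessing takes quite a long time to create and start each process. I changed the program to make arr 10 times larger and changed the way the processes are started, and there is a slight speed-up:

(Also note this is python 3)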

import time
import multiprocessing, threading
from multiprocessing import Queue
import random

def calc_square_q(numbers,q):
    while q.empty():
        pass
    return calc_square(numbers)

if __name__ == "__main__":

    random.seed(1)   # note how big arr is now vvvvvvv
    arr = [random.randint(1, 11) for i in range(10000000)]
    print("init completed")

    # ...
    # other stuff as before
    # ...

    processes=[]
    q=Queue()
    for arrs in newarr:
        processes.append(multiprocessing.Process(target=calc_square_q, args=(arrs,q)))

    print('start processes')
    for p in processes:
        p.start()  # even tho' each process is started it waits...

    print('join processes')
    q.put(None)   # ... for q to become not empty.
    start_time = time.time()
    for p in processes:
        p.join()

    end_time = time.time()

    print("mp: " + str(end_time - start_time))

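Notice also above how I create and start the processes in two different loops, and then finally join the processes in a third loop.

Output of the modified program: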
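The difference between the two loop structures can be seen in isolation with a small sketch. It is an illustration, not the answer's code: nap, start_and_join_each and start_all_then_join are hypothetical names, and time.sleep stands in for real work because sleeping releases the GIL and lets threads overlap. With a CPU-bound pure-Python loop such as calc_square, threads in CPython would not be expected to overlap this way, which matches the mt timings in the post.

```python
import threading
import time


def nap(seconds):
    # Stand-in for real work (assumption): sleeping releases the GIL.
    time.sleep(seconds)


def start_and_join_each(n_workers, seconds):
    # Pattern from the question: join() right after start(),
    # so each worker finishes before the next one is started.
    began = time.perf_counter()
    for _ in range(n_workers):
        t = threading.Thread(target=nap, args=(seconds,))
        t.start()
        t.join()
    return time.perf_counter() - began


def start_all_then_join(n_workers, seconds):
    # Pattern from the answer: create all, start all, then join all.
    workers = [threading.Thread(target=nap, args=(seconds,))
               for _ in range(n_workers)]
    began = time.perf_counter()
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return time.perf_counter() - began


one_by_one = start_and_join_each(4, 0.1)
overlapped = start_all_then_join(4, 0.1)
print("start+join in one loop: %.2f s" % one_by_one)
print("start all, then join:   %.2f s" % overlapped)
```

The first figure should be roughly the sum of the four naps and the second roughly the length of one, since only the second pattern lets the workers run at the same time.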

init completed
serial: 0.53214430809021
split complete
start threads
mt: 0.5551605224609375
start processes
join processes
mp: 0.2800724506378174

Increasing the size of arr by another factor of 10:

init completed
serial: 5.8455305099487305
split complete
start threads
mt: 5.411392450332642
start processes
join processes
mp: 1.9705185890197754

And yes, I also tried this in python 2.7, although Threads seemed slower there.
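The guess at the start of this answer, that creating and starting each process is what costs the time, can be checked roughly by timing workers that do nothing at all. This is a minimal sketch under that assumption; do_nothing and startup_cost are hypothetical names, and the numbers will vary by machine and operating system.

```python
import multiprocessing
import threading
import time


def do_nothing():
    # Empty target: any time measured is start/join overhead only.
    pass


def startup_cost(worker_cls, n_workers):
    # Average seconds to create, start and join one worker.
    # worker_cls is threading.Thread or multiprocessing.Process.
    began = time.perf_counter()
    for _ in range(n_workers):
        w = worker_cls(target=do_nothing)
        w.start()
        w.join()
    return (time.perf_counter() - began) / n_workers


if __name__ == "__main__":
    print("per thread:  %.6f s" % startup_cost(threading.Thread, 8))
    print("per process: %.6f s" % startup_cost(multiprocessing.Process, 8))
```

If the per-process figure is a sizeable fraction of the serial time for the whole array, that overhead alone would account for much of the mp result in the question.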

