
Parallelizing through Multi-threading and Multi-processing taking significantly more time than serial

I'm trying to learn how to do parallel programming in Python. I wrote a simple integer-squaring function and then ran it serially, multi-threaded, and multi-process:

import time
import multiprocessing, threading
import random


def calc_square(numbers):
    sq = 0
    for n in numbers:
        sq = n*n

def splita(list, n):
    a = [[] for i in range(n)]
    counter = 0
    for i in range(0,len(list)):
        a[counter].append(list[i])
        if len(a[counter]) == len(list)/n:
            counter = counter +1
            continue
    return a


if __name__ == "__main__":

    random.seed(1)
    arr = [random.randint(1, 11) for i in xrange(1000000)]
    print "init completed"

    start_time2 = time.time()
    calc_square(arr)
    end_time2 = time.time()

    print "serial: " + str(end_time2 - start_time2)

    newarr = splita(arr,8)
    print 'split complete'

    start_time = time.time()

    for i in range(8):
        t1 = threading.Thread(target=calc_square, args=(newarr[i],))

        t1.start()
        t1.join()

    end_time = time.time()

    print "mt: " + str(end_time - start_time)

    start_time = time.time()

    for i in range(8):
        p1 = multiprocessing.Process(target=calc_square, args=(newarr[i],))
        p1.start()
        p1.join()

    end_time = time.time()

    print "mp: " + str(end_time - start_time)

Output:

init completed
serial: 0.0640001296997
split complete
mt: 0.0599999427795
mp: 2.97099995613

However, as you can see, something weird is happening: mt takes about the same time as serial, and mp actually takes significantly longer (almost 50 times longer).

What am I doing wrong? Could someone push me in the right direction to learn parallel programming in python?

Edit 01

Looking at the comments, I see that a function that doesn't return anything may seem pointless. The reason I'm even trying this is that previously I tried the following add function:

def addi(numbers):
    sq = 0
    for n in numbers:
        sq = sq + n
    return sq

My plan was to return the sum of each chunk and then add those partial sums together serially, so that I could at least see some performance improvement over a pure serial implementation. However, I couldn't figure out how to store and use the returned values, which is why I'm now trying something even simpler than that: just dividing up the array and running a simple function on each piece.

Thanks!

I think that multiprocessing takes quite a long time to create and start each process. I changed the program to make arr 10 times larger and changed the way the processes are started, and there is a slight speed-up:

(Also note this is Python 3.)

import time
import multiprocessing, threading
from multiprocessing import Queue
import random

def calc_square_q(numbers,q):
    while q.empty():
        pass
    return calc_square(numbers)

if __name__ == "__main__":

    random.seed(1)   # note how big arr is now vvvvvvv
    arr = [random.randint(1, 11) for i in range(10000000)]
    print("init completed")

    # ...
    # other stuff as before
    # ...

    processes=[]
    q=Queue()
    for arrs in newarr:
        processes.append(multiprocessing.Process(target=calc_square_q, args=(arrs,q)))

    print('start processes')
    for p in processes:
        p.start()  # even tho' each process is started it waits...

    print('join processes')
    q.put(None)   # ... for q to become not empty.
    start_time = time.time()
    for p in processes:
        p.join()

    end_time = time.time()

    print("mp: " + str(end_time - start_time))

Also notice above how I create and start the processes in two different loops, and then finally join with the processes in a third loop.

Output:

init completed
serial: 0.53214430809021
split complete
start threads
mt: 0.5551605224609375
start processes
join processes
mp: 0.2800724506378174

With another factor-of-10 increase in the size of arr:

init completed
serial: 5.8455305099487305
split complete
start threads
mt: 5.411392450332642
start processes
join processes
mp: 1.9705185890197754

And yes, I've also tried this in Python 2.7, although the threads seemed slower.
