
Python multiprocessing slower than single thread

I have been playing around with a multiprocessing problem and noticed my algorithm is slower when I parallelize it than when it runs in a single thread.

In my code I don't share memory, and I'm pretty sure my algorithm (see code), which is just nested loops, is CPU bound.

However, no matter what I do, the parallel code runs 10-20% slower on all my computers.

I also ran this on a 20-CPU virtual machine, and single-threaded beats multithreaded every time (it was actually even slower there than on my computer).

from multiprocessing.dummy import Pool as ThreadPool
from multi import chunks
from random import random
import logging
import time

# Produce two sets of stuff we can iterate over
S = []
for x in range(100000):
  S.append({'value': x*random()})
H = []
for x in range(255):
  H.append({'value': x*random()})

# the function for each thread
# just nested iteration
def doStuff(HH):
  R = []
  for k in HH['S']:
    for h in HH['H']:
      R.append(k['value'] * h['value'])
  return R

# we will split the work
# between the worker thread and give it
# 5 item each to iterate over the big list
HChunks = chunks(H, 5)
XChunks = []

# turn them into dictionaries, so I can pass in both
# the S and H lists
# Note: I do this because I'm not sure whether using the global
# S would spend too much time on cache synchronization or not.
# The idea is that I don't want the threads to share anything.
for x in HChunks:
  XChunks.append({'H': x, 'S': S})

print("Process")
t0 = time.time()
pool = ThreadPool(4)
R = pool.map(doStuff, XChunks)
pool.close()
pool.join()

t1 = time.time()

# the measured time for 4 threads is slower
# than when this code just runs
# doStuff(..) in a non-parallel way
# Why!?

total = t1-t0
print("Took", total, "secs")

There are many related questions open, but most are geared toward code being structured incorrectly - each worker being IO bound and such.

You are using multithreading, not multiprocessing. While many languages allow threads to run in parallel, Python does not. A thread is just a separate thread of control, i.e. it holds its own stack, current function, etc. The Python interpreter just switches between executing each stack every now and then.

Basically, all threads run on a single core. They will only speed up your program when you are not CPU bound.

multiprocessing.dummy replicates the API of multiprocessing, but it is no more than a wrapper around the threading module.
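You can see this for yourself: a `multiprocessing.dummy` pool's workers all live inside the parent process. A quick sketch (the lambda and `range(8)` are just illustrative work items):

```python
import os
from multiprocessing.dummy import Pool as ThreadPool

# multiprocessing.dummy runs its workers as threads inside the
# current process, so every "worker" reports the parent's PID.
pool = ThreadPool(4)
pids = set(pool.map(lambda _: os.getpid(), range(8)))
pool.close()
pool.join()

print(pids == {os.getpid()})  # True: all work ran in this one process
```

With a real `multiprocessing.Pool`, the same check would collect several different PIDs, one per worker process.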

Multithreading is usually slower than single-threading if you are CPU bound. This is because the work and the processing resources stay the same, but you add overhead for managing the threads, e.g. switching between them.
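You can reproduce the effect with any pure-Python, CPU-bound function; the exact numbers vary by machine, so this sketch only prints the two timings rather than asserting which wins:

```python
import time
from multiprocessing.dummy import Pool as ThreadPool

def cpu_task(n):
    # Pure-Python arithmetic: the thread holds the GIL the whole time
    total = 0
    for i in range(n):
        total += i * i
    return total

work = [200_000] * 8

t0 = time.time()
serial = [cpu_task(n) for n in work]
t_serial = time.time() - t0

pool = ThreadPool(4)
t0 = time.time()
threaded = pool.map(cpu_task, work)
pool.close()
pool.join()
t_threads = time.time() - t0

print(f"serial:  {t_serial:.3f}s")
print(f"threads: {t_threads:.3f}s")  # typically no faster: same work plus switching overhead
```

Both runs compute identical results; the threaded one just pays extra for thread management while still executing one bytecode stream at a time.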

How to fix this: instead of `from multiprocessing.dummy import Pool as ThreadPool`, use `from multiprocessing import Pool`. Each worker in a `multiprocessing.Pool` is a separate process with its own interpreter, so CPU-bound work can actually run in parallel.


You might want to read up on the GIL, the Global Interpreter Lock. It's what prevents threads from running in parallel (that, and its implications for single-threaded performance). Python interpreters other than CPython may not have a GIL and may be able to run multithreaded code on several cores.
