
joblib parallel computation time

Joblib takes more time for parallel computation with n_jobs > 1 (n_jobs=2 finishes in 12.6 s) than with n_jobs=1 (1.3 s). I am on Mac OS X 10.9 with 16 GB RAM. Am I making a mistake? Here is a simple demo:

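from joblib import Parallel, delayed

def func():
    # Lazily generate all 200 * 300 = 60,000 (i, j) pairs.
    for i in range(200):
        for j in range(300):
            yield i, j

def evaluate(x):
    # Each task is a single multiplication, so it is very cheap.
    i, j = x
    p = i * j
    return p, i, j

if __name__ == '__main__':
    # One delayed call is dispatched per (i, j) pair.
    results = Parallel(n_jobs=3, verbose=2)(delayed(evaluate)(x) for x in func())
    res, i, j = zip(*results)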

Short answer: joblib is a multiprocessing system, and booting up a new Python process for each of your 3 simultaneous jobs carries a fair amount of overhead. Each call to evaluate does only a single multiplication, so that overhead dominates the useful work. As a result, your specific code is likely to get even slower if you add more jobs.

The joblib documentation has some discussion of this overhead.
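To judge whether the overhead is worth paying, it can help to time the work in a single process first. The following is a minimal sketch, not part of the original question: `serial_baseline` is a hypothetical helper, and `evaluate` mirrors the function from the question.

```python
import time


def evaluate(x):
    # Same computation as in the question: one multiplication per task.
    i, j = x
    return i * j, i, j


def serial_baseline(rows=200, cols=300):
    """Run every task in the current process and return (results, seconds)."""
    start = time.perf_counter()
    results = [evaluate((i, j)) for i in range(rows) for j in range(cols)]
    return results, time.perf_counter() - start


results, elapsed = serial_baseline()
per_task_us = elapsed / len(results) * 1e6
print(f"{len(results)} tasks in {elapsed:.4f} s, about {per_task_us:.2f} microseconds per task")
```

If the per-task time is tiny compared with the cost of handing a task to another process, parallelism is unlikely to pay off for that workload.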

The workarounds aren't great:

  1. Accept the overhead.
  2. Don't use parallel code.
  3. Use multithreading instead of multiprocessing. Unfortunately, multithreading is rarely an option unless you are using a fully compiled function in place of evaluate, because the Python GIL (global interpreter lock) lets only one thread run Python code at a time, so pure-Python threads are effectively single-threaded.
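One way to make option 1 cheaper, sketched here as an assumption rather than something the answer proposes, is to hand each worker a batch of pairs instead of a single pair, so the dispatch cost is spread over many evaluations. The helpers `chunked`, `evaluate_batch`, and `run_batched` and the batch size of 5000 are hypothetical; `Parallel` and `delayed` are the same joblib calls used in the question. Recent joblib versions also group small tasks on their own through the `batch_size` parameter, so the gain from doing it by hand may be modest.

```python
def evaluate(x):
    # Same computation as in the question.
    i, j = x
    return i * j, i, j


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def evaluate_batch(batch):
    # One dispatched task now performs many evaluations.
    return [evaluate(x) for x in batch]


def run_batched(pairs, n_jobs=3, size=5000):
    # joblib is a third-party package; it is imported here so the helpers
    # above stay usable without it.
    from joblib import Parallel, delayed

    nested = Parallel(n_jobs=n_jobs)(
        delayed(evaluate_batch)(batch) for batch in chunked(pairs, size)
    )
    # Flatten the per-batch lists back into one list of (p, i, j) tuples.
    return [result for batch in nested for result in batch]
```

In the question's script, `results = run_batched(func())` under the `if __name__ == '__main__':` guard would replace the original `Parallel(...)` line, and `zip(*results)` would work unchanged.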

That said, for functions that take a long time, multiprocessing is often worth it. Depending on your application, it's really a judgment call. Note that every variable used in the function is copied to each process. Python normally passes references rather than copying, so this can be a surprise. As a result, the overhead is in part a function of the size of the variables passed either explicitly or implicitly (e.g. via use of global variables).
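A rough way to see how argument size feeds into that copying cost is to measure how many bytes the arguments occupy once serialized. This sketch uses the standard-library `pickle` module as a stand-in. `payload_bytes` is a hypothetical helper, and joblib's own serialization may differ (for example, it can memory-map large NumPy arrays), so treat the numbers as an approximation.

```python
import pickle


def payload_bytes(*args, **kwargs):
    """Return the size in bytes of the pickled call arguments."""
    return len(pickle.dumps((args, kwargs), protocol=pickle.HIGHEST_PROTOCOL))


# A single (i, j) pair, as passed to evaluate in the question.
small = payload_bytes((3, 4))
# A much larger argument, such as a list a function might receive.
large = payload_bytes(list(range(100_000)))

print(f"small argument: {small} bytes, large argument: {large} bytes")
```

The larger the serialized payload per task, the more time goes into moving data between processes rather than computing.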
