
Using joblib makes the program run much slower, why?

I have a great many small tasks to do in a for loop. I want to use concurrency to speed it up. I used joblib because it is easy to integrate. However, I found that using joblib makes my program run much slower than a simple for loop. Here is the demo code:

import time
import random
from os import path
import tempfile
import numpy as np
import gc
from joblib import Parallel, delayed, load, dump

def func(a, i):
    '''a simple task for demonstration'''
    a[i] = random.random()

def memmap(a):
    '''use memory mapping to prevent memory allocation for each worker'''
    tmp_dir = tempfile.mkdtemp()
    mmap_fn = path.join(tmp_dir, 'a.mmap')
    print 'mmap file:', mmap_fn
    _ = dump(a, mmap_fn)        # dump
    a_mmap = load(mmap_fn, 'r+') # load
    del a
    gc.collect()
    return a_mmap

if __name__ == '__main__':
    N = 10000
    a = np.zeros(N)

    # memory mapping
    a = memmap(a)

    # parfor
    t0 = time.time()
    Parallel(n_jobs=4)(delayed(func)(a, i) for i in xrange(N))
    t1 = time.time()-t0

    # for 
    t0 = time.time()
    [func(a, i) for i in xrange(N)]
    t2 = time.time()-t0  

    # joblib time vs for time
    print t1, t2

On my laptop (i5-2520M CPU, 4 cores, Win7 64-bit), the running time is 6.464 s for joblib and 0.004 s for the simple for loop.

I passed the argument as a memory-mapped array to avoid the overhead of reallocating it for each worker. I've read this related post, but it still did not solve my problem. Why does this happen? Did I miss some discipline needed to use joblib correctly?

"Many small tasks" are not a good fit for joblib. The coarser the task granularity, the less overhead joblib causes and the more benefit you will get from it. With tiny tasks, the cost of setting up worker processes and communicating data to them will outweigh any benefit from parallelization. In the demo, each of the 10,000 tasks performs a single array assignment, so nearly all of the 6.464 s is dispatch overhead rather than useful work.

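One way to make the tasks coarser is to hand each worker a whole slice of the index range instead of a single index. The following is only a minimal sketch of that idea, not a drop-in fix: the helper names `chunk_ranges`, `fill_chunk` and `parallel_fill` are made up for illustration, and the workers return their values rather than writing into the shared memory-mapped array as the question's code does. It is written for Python 3.

```python
import random


def chunk_ranges(n, n_chunks):
    """Split range(n) into at most n_chunks contiguous (start, stop) pairs."""
    size = max(1, -(-n // n_chunks))  # ceiling division, at least 1
    return [(start, min(start + size, n)) for start in range(0, n, size)]


def fill_chunk(start, stop):
    """One coarse task: produce the values for indices start..stop-1."""
    return [random.random() for _ in range(start, stop)]


def parallel_fill(n, n_jobs=4):
    """Dispatch one task per chunk instead of one task per element."""
    # Imported here so the two helpers above can be used without joblib.
    from joblib import Parallel, delayed

    parts = Parallel(n_jobs=n_jobs)(
        delayed(fill_chunk)(start, stop)
        for start, stop in chunk_ranges(n, n_jobs)
    )
    # Flatten the per-chunk results back into one list, in index order.
    return [value for part in parts for value in part]


if __name__ == "__main__":
    values = parallel_fill(10000)
    print(len(values))
```

With this layout joblib dispatches 4 tasks rather than 10,000. For work as cheap as the demo's, the plain for loop may still be faster, because starting the worker processes alone can cost more than a few milliseconds; chunking pays off only once each chunk does a substantial amount of work.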