
How to parallelize a loop in Python with a function that takes several arguments?

I am trying to parallelize a for loop in Python that must execute a function with several arguments, one of which changes through the loop. The loop itself needs to be embedded in a function. I have already looked here, here and here on Stack Overflow, and beyond (here and here), but I just cannot make it work.

Below is a MWE:

import time
import numpy as np
from multiprocessing import Pool
from functools import partial

def mytestFun(otherStuff, myparams):
    return myparams[0]*otherStuff - myparams[1]

def myfun1(extraParams, mylist):
    [myMat, otherStuff] = extraParams
    
    for ivals in mylist:
        myparams = myMat[ivals,:]
        result = mytestFun(otherStuff, myparams)
    return result

if __name__ == '__main__':
    a_list = [0, 1, 2, 3, 4, 5]

    myMat = np.random.uniform(0,1,(6,2))
    extraParams = [myMat, 5]
    print(myfun1(extraParams, a_list))
    pool = Pool()
    func = partial(myfun1, extraParams)
    pool.map(func, a_list)
    pool.close()
    pool.join()

And I keep getting an error that I don't know how to interpret:

Traceback (most recent call last):
  File "exampleMultiProcessing.py", line 61, in <module>
    pool.map(func, a_list)
  File "/Users/laurama/miniconda3/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/laurama/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
TypeError: cannot unpack non-iterable int object

Thanks in advance!

You can read about joblib here. When we use joblib, we pass it the arguments of the function we want to parallelise. Here I am passing the arguments directly to the function, which is why the loop variable is the underscore `_`: I am simply ignoring it, and you could use any name there.

And yes, `Parallel` automatically distributes the work across the number of cores given by `n_jobs`.

Try this:

import numpy as np
from joblib import Parallel, delayed

# myfun1 (and mytestFun, which it calls) are the functions defined in the question.

if __name__ == '__main__':
    a_list = [0, 1, 2, 3, 4, 5]
    myMat = np.random.uniform(0, 1, (6, 2))
    extraParams = [myMat, 5]
    print(myfun1(extraParams, a_list))
    # A single delayed call that hands the whole list to one job:
    result = Parallel(n_jobs=8)(delayed(myfun1)(extraParams, a_list) for _ in range(1))[0]
