How do I optimise a Python loop with parallelization/multiprocessing?

I have to loop N times to calculate formulas and add the results to a dataframe. My code works and takes a few seconds to process each item. However, it can only process one item at a time because I'm running the array through a for loop.

I tried to update the code by adding the numba library to optimise it:

def calculationResults(myconfig,df_results,isvalid,dimension,....othersparams):
    for month in nb.prange(0, myconfig.len_production):
        calculationbymonth(month,df_results,....othersparams)
    return df_results

But it's still processing one item at a time. Any ideas?

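nb.prange only runs in parallel inside a function compiled with @nb.njit(parallel=True); in plain Python it behaves like range, and Numba cannot compile code that works on a dataframe. Instead, we can use a parallelized apply: split the dataframe into chunks and process each chunk in a separate process, using a function similar to the one below. Here func is the function applied to each chunk.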

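import numpy as np
import pandas as pd
from multiprocessing import Pool

def parallelize_dataframe(df, func, n_cores=4):
    # Split the dataframe into n_cores chunks of rows.
    df_split = np.array_split(df, n_cores)
    pool = Pool(n_cores)
    # Run func on each chunk in a separate process, then reassemble in order.
    df = pd.concat(pool.map(func, df_split))
    pool.close()
    pool.join()
    return df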
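
As a usage illustration, here is a minimal, self-contained sketch of how such a function might be called. The worker add_total and the columns price and quantity are hypothetical, made up for this example. The sketch also swaps np.array_split for plain iloc slicing and uses Pool as a context manager; these are my substitutions, not part of the answer. The worker has to be defined at module level so it can be pickled, and the call is placed under an `if __name__ == "__main__":` guard, which is needed on platforms that start worker processes by spawning.

```python
from multiprocessing import Pool

import numpy as np
import pandas as pd


def add_total(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical per-chunk worker: adds a column computed from each row."""
    chunk = chunk.copy()  # avoid mutating the slice that was passed in
    chunk["total"] = chunk["price"] * chunk["quantity"]
    return chunk


def parallelize_dataframe(df, func, n_cores=4):
    # Assumption: split by row position with iloc instead of np.array_split.
    bounds = np.linspace(0, len(df), n_cores + 1, dtype=int)
    chunks = [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
    with Pool(n_cores) as pool:
        # pool.map returns results in chunk order, so concat keeps the row order.
        return pd.concat(pool.map(func, chunks))


if __name__ == "__main__":
    df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0], "quantity": [10, 20, 30, 40]})
    result = parallelize_dataframe(df, add_total, n_cores=2)
    print(result)
```

Each chunk is pickled, sent to a worker process, and the result is pickled back, so this approach tends to pay off only when the per-chunk computation is expensive compared with that copying overhead.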
