How do I optimise a Python loop with parallelization/multiprocessing?

I have to loop N times to calculate formulas and add the results to a DataFrame. My code works, but it takes a few seconds to process each item, and it can only handle one item at a time because I'm running the array through a for loop.

I tried updating the code and added the numba library to optimise it:

import numba as nb

def calculationResults(myconfig, df_results, isvalid, dimension, *othersparams):
    # Run the per-month calculation for every production month
    for month in nb.prange(0, myconfig.len_production):
        calculationbymonth(month, df_results, *othersparams)
    return df_results

But it's still doing one item at a time. Any ideas?

We can parallelize the work with a function similar to the one below, which splits the DataFrame into chunks and applies a function to each chunk in a separate process.

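import numpy as np
import pandas as pd
from multiprocessing import Pool

def parallelize_dataframe(df, func, n_cores=4):
    # Split the DataFrame into n_cores roughly equal chunks
    df_split = np.array_split(df, n_cores)
    pool = Pool(n_cores)
    # Apply func to each chunk in a separate process, then reassemble
    df = pd.concat(pool.map(func, df_split))
    pool.close()
    pool.join()
    return df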
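
As for why the original loop stays serial: as far as I know, nb.prange only runs in parallel inside a function compiled with @nb.njit(parallel=True), and such compiled functions cannot work on pandas DataFrames. In plain Python it behaves like range. The same Pool idea can be applied to the monthly loop directly. The following is only a rough sketch under assumptions: calculation_by_month and calculation_rows are hypothetical names, the month ** 2 formula is a placeholder for the real per-month calculation, and each month is assumed to be independent of the others.

```python
from multiprocessing import Pool


def calculation_by_month(month):
    """Hypothetical stand-in for the real per-month formulas.

    Returns one result row instead of writing into a shared DataFrame,
    because worker processes do not share memory with the parent.
    """
    return {"month": month, "value": month ** 2}  # placeholder formula


def calculation_rows(n_months, n_workers=4):
    """Compute every month in a pool of worker processes, in month order."""
    with Pool(n_workers) as pool:
        return pool.map(calculation_by_month, range(n_months))


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by spawning
    # a fresh interpreter (for example Windows).
    rows = calculation_rows(12)
    print(rows[:3])
```

The list of row dictionaries can then be turned into a DataFrame in the parent process with pd.DataFrame(rows). Returning rows rather than mutating df_results inside the workers matters here, since changes made to a DataFrame in a worker process are not visible to the parent.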
