
Python - Optimize Lambda with Numpy Operations

I'm having a difficult time optimizing the following calculation:

Inner_diff_grp = np.var(list(map(lambda x: np.percentile(winw2_grp, x[0]) - np.percentile(winw2_grp, x[1]), [(i + 7, i) for i in range(0, 98, 7)])))

'winw2_grp' is a small image array (say 5x5). I'm computing percentiles at every 7th step (0, 7, 14, ..., 98), taking the difference between each pair of percentiles 7 apart, and then calculating the variance of those differences.
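For reference, the lambda/map expression above can be collapsed into a single vectorized `np.percentile` call, since `np.percentile` accepts an array of percentiles and `np.diff` then produces the same 14 pairwise differences. A minimal sketch (with a random 5x5 array standing in for `winw2_grp`):

```python
import numpy as np

rng = np.random.default_rng(0)
winw2_grp = rng.random((5, 5))  # stand-in for one image window

# Original: one np.percentile call per pair -> 28 calls in total.
inner_diff_grp = np.var(list(map(
    lambda x: np.percentile(winw2_grp, x[0]) - np.percentile(winw2_grp, x[1]),
    [(i + 7, i) for i in range((0), 98, 7)])))

# Vectorized: one call computing all 15 percentiles at once;
# np.diff yields the same 14 differences between percentiles 7 apart.
qs = np.arange(0, 99, 7)  # 0, 7, ..., 98
vectorized = np.var(np.diff(np.percentile(winw2_grp, qs)))

assert np.isclose(inner_diff_grp, vectorized)
```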

There are around 100,000 images in the loop. Earlier I was using standard loops, but I've since switched to Pandas.apply, which performs better; throughput is around 150 iterations/sec now - which still means more than 10 minutes of runtime.

Apart from trying out pooling to exploit all CPUs, is there any way to optimize this calculation?

So, as per the suggestion from @Ehsan, I enclosed the calculation in a separate function with the Numba decorator, and that was it. I deliberately removed the lambda because I wanted to try out other optimizations (parallel execution) - so it's not part of the strategy but rather a WIP.

@nb.jit(nopython=True, fastmath=True)
def numba_perc_calc(win):
    # percentiles at every 7th step: 0, 7, ..., 98
    arr = [0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98]
    perc = np.percentile(win, arr)
    dif = np.diff(perc)         # differences between consecutive percentiles
    var_of_percs = np.var(dif)  # variance of those differences
    return var_of_percs

The timing result on a smaller test set follows.

[screenshot of timing results]
