简体   繁体   中英

Improving runtime of weighted moving average filter function?

I have a weighted moving average function which smooths a curve by averaging 3*width values to the left and to the right of each point using a gaussian weighting mechanism. I am only worried about smoothing a region bounded by [start, end]. The following code works, but the problem is runtime with large arrays.

import numpy as np
def weighted_moving_average(x, y, start, end, width = 3):
    def gaussian(x, a, m, s):
        return a*exp(-(x-m)**2/(2*s**2))
    cut = (x>=start-3*width)*(x<=end+3*width)
    x, y = x[cut], y[cut]
    x_avg = x[(x>=start)*(x<=end)]
    y_avg = np.zeros(len(x_avg))
    bin_vals = np.arange(-3*width,3*width+1)
    weights = gaussian(bin_vals, 1, 0, width)
    for i in range(len(x_avg)):
        y_vals = y[i:i+6*width+1]
        y_avg[i] = np.average(y_vals, weights = weights)
    return x_avg, y_avg

From my understanding, it is generally inefficient to loop through a NumPy array. I was wondering if anyone had an idea to replace the for loop with something more runtime efficient.

Thanks

That slicing and summing/averaging on a weighted window basically corresponds to 1D convolution with the kernel being flipped. Now, for 1D convolution , NumPy has a very efficient implementation in np.convolve and that could be used to get rid of the loop and give us y_avg . Thus, we would have a vectorized implementation like so -

y_sums = np.convolve(y,weights[::-1],'valid')
y_avg = np.true_divide(y_sums,weights.sum())

The main concern with looping over a large array is that the memory allocation for the large array can be expensive, and the whole thing has to be initialized before the loop can start.

In this particular case I'd go with what Divakar is saying.

In general, if you find yourself in a circumstance where you really need to iterate over a large collection, use iterators instead of arrays. For a relatively simple case like this, just replace range with xrange (see https://docs.python.org/2/library/functions.html#xrange ).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM