
Improving runtime of weighted moving average filter function?

I have a weighted moving average function which smooths a curve by averaging the 3*width values to the left and to the right of each point, using a Gaussian weighting mechanism. I am only interested in smoothing the region bounded by [start, end]. The following code works, but the problem is the runtime with large arrays.

import numpy as np

def weighted_moving_average(x, y, start, end, width=3):
    def gaussian(x, a, m, s):
        return a * np.exp(-(x - m)**2 / (2 * s**2))
    # Keep only the points needed to smooth the [start, end] region
    cut = (x >= start - 3*width) & (x <= end + 3*width)
    x, y = x[cut], y[cut]
    x_avg = x[(x >= start) & (x <= end)]
    y_avg = np.zeros(len(x_avg))
    # Gaussian weights over a window spanning 3*width points on each side
    bin_vals = np.arange(-3*width, 3*width + 1)
    weights = gaussian(bin_vals, 1, 0, width)
    for i in range(len(x_avg)):
        y_vals = y[i:i + 6*width + 1]
        y_avg[i] = np.average(y_vals, weights=weights)
    return x_avg, y_avg

From my understanding, it is generally inefficient to loop through a NumPy array. I was wondering if anyone had an idea for replacing the for loop with something more runtime-efficient.

Thanks

That slicing and summing/averaging over a weighted window basically corresponds to 1D convolution with the kernel flipped. Now, for 1D convolution, NumPy has a very efficient implementation in np.convolve, and that could be used to get rid of the loop and give us y_avg. Thus, we would have a vectorized implementation like so -

y_sums = np.convolve(y, weights[::-1], 'valid')   # windowed weighted sums
y_avg = np.true_divide(y_sums, weights.sum())     # normalize to weighted averages
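Putting that convolution together with the cropping step from the question, a full vectorized version might look like the sketch below. The name `weighted_moving_average_fast` is my own, and like the original loop it implicitly assumes roughly unit-spaced x values, so the index window and the x-value window line up:

```python
import numpy as np

def weighted_moving_average_fast(x, y, start, end, width=3):
    # Same cropping and Gaussian-weight construction as the looped version
    cut = (x >= start - 3*width) & (x <= end + 3*width)
    x, y = x[cut], y[cut]
    x_avg = x[(x >= start) & (x <= end)]
    bin_vals = np.arange(-3*width, 3*width + 1)
    weights = np.exp(-bin_vals**2 / (2 * width**2))
    # 'valid' mode slides the full window over y, exactly where the loop did;
    # flipping the (symmetric) kernel keeps the correspondence explicit
    y_sums = np.convolve(y, weights[::-1], 'valid')
    y_avg = np.true_divide(y_sums, weights.sum())
    return x_avg, y_avg
```

With len(weights) = 6*width + 1, the 'valid' convolution produces exactly len(y) - 6*width outputs, one per point of x_avg, so the result should match the looped version up to floating-point noise.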

The main concern with looping over a large array is that the memory allocation for the large array can be expensive, and the whole thing has to be initialized before the loop can start.

In this particular case I'd go with what Divakar is saying.

In general, if you find yourself in a circumstance where you really need to iterate over a large collection, use iterators instead of arrays. For a relatively simple case like this, just replace range with xrange in Python 2 (see https://docs.python.org/2/library/functions.html#xrange ); in Python 3, range is already lazy.
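As a minimal illustration of the iterator idea (not specific to the question's code), a generator yields values one at a time instead of materializing the whole sequence in memory first:

```python
def squares(n):
    """Lazily yield the first n square numbers, one at a time."""
    for i in range(n):  # in Python 3, range is itself a lazy sequence
        yield i * i

# Consumes the generator in constant memory; no list of a million
# squares is ever built
total = sum(squares(1_000_000))
```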
