Weighted moving average in Python

Question

I have data sampled at essentially random intervals. I would like to compute a weighted moving average using numpy (or another Python package). I have a crude implementation of a moving average, but I am having trouble finding a good way to do a weighted moving average, so that values towards the center of the bin are weighted more than values towards the edges.

Here I generate some sample data and then take a moving average. How can I most easily implement a weighted moving average? Thanks!

import numpy as np
import matplotlib.pyplot as plt

#first generate some datapoint for a randomly sampled noisy sinewave
x = np.random.random(1000)*10
noise = np.random.normal(scale=0.3,size=len(x))
y = np.sin(x) + noise

#plot the data
plt.plot(x,y,'ro',alpha=0.3,ms=4,label='data')
plt.xlabel('Time')
plt.ylabel('Intensity')

#define a moving average function
def moving_average(x,y,step_size=.1,bin_size=1):
    bin_centers  = np.arange(np.min(x),np.max(x)-0.5*step_size,step_size)+0.5*step_size
    bin_avg = np.zeros(len(bin_centers))

    for index in range(0,len(bin_centers)):
        bin_center = bin_centers[index]
        items_in_bin = y[(x>(bin_center-bin_size*0.5) ) & (x<(bin_center+bin_size*0.5))]
        bin_avg[index] = np.mean(items_in_bin)

    return bin_centers,bin_avg

#plot the moving average
bins, average = moving_average(x,y)
plt.plot(bins, average,label='moving average')

plt.show()

The output: [figure: data and moving average]

Using the advice from crs17 to use "weights=" in the np.average function, I came up with a weighted average function, which uses a Gaussian function to weight the data:

def weighted_moving_average(x,y,step_size=0.05,width=1):
    bin_centers  = np.arange(np.min(x),np.max(x)-0.5*step_size,step_size)+0.5*step_size
    bin_avg = np.zeros(len(bin_centers))

    #We're going to weight with a Gaussian function
    def gaussian(x,amp=1,mean=0,sigma=1):
        return amp*np.exp(-(x-mean)**2/(2*sigma**2))

    for index in range(0,len(bin_centers)):
        bin_center = bin_centers[index]
        weights = gaussian(x,mean=bin_center,sigma=width)
        bin_avg[index] = np.average(y,weights=weights)

    return (bin_centers,bin_avg)

Results look good: [figure: weighted average using numpy]
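For reference, the Gaussian weighting above can also be written without the Python loop by broadcasting the sample positions against the evaluation points. This is only a sketch of the same idea: the function name gaussian_weighted_average and the centers argument are illustrative and do not appear in the original post, and the intermediate weight matrix has len(centers) x len(x) entries, so it assumes the data fits comfortably in memory.

```python
import numpy as np

def gaussian_weighted_average(x, y, centers, sigma=1.0):
    """Gaussian-weighted average of y evaluated at each value in centers.

    Hypothetical helper for illustration; not part of numpy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # weights[i, j] is the weight of sample j when averaging around centers[i]
    weights = np.exp(-(x[None, :] - centers[:, None]) ** 2 / (2 * sigma ** 2))
    # Row-wise weighted mean: sum_j w_ij * y_j / sum_j w_ij
    return weights @ y / weights.sum(axis=1)

# Toy data: a constant signal, so every weighted average equals that constant
x_demo = np.array([0.0, 0.7, 1.9, 3.2, 4.0])
y_demo = np.full_like(x_demo, 2.5)
centers_demo = np.linspace(0, 4, 9)
smoothed = gaussian_weighted_average(x_demo, y_demo, centers_demo, sigma=0.5)
```

With the question's data this could be called as gaussian_weighted_average(x, y, bin_centers, sigma=width), which should match the looped version up to floating-point rounding.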

Answer 1

You could use numpy.average, which allows you to specify weights:

>>> bin_avg[index] = np.average(items_in_bin, weights=my_weights)

So to calculate the weights, you could find the x coordinates of each data point in the bin and calculate their distances to the bin center.
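As a rough illustration of that suggestion, the sketch below weights each point in a single bin by its distance from the bin center, using a triangular profile that falls linearly to zero at the bin edges. The function name triangular_bin_average and the triangular shape are assumptions made for this example; any weight that decreases with distance would serve the same purpose.

```python
import numpy as np

def triangular_bin_average(x, y, bin_center, bin_size=1.0):
    """Average the y values inside one bin, weighting central points more.

    Hypothetical helper for illustration; not part of numpy.
    """
    half_width = 0.5 * bin_size
    in_bin = np.abs(x - bin_center) < half_width
    if not in_bin.any():
        return np.nan  # no samples fall inside this bin
    distances = np.abs(x[in_bin] - bin_center)
    # Weight falls linearly from 1 at the center to 0 at the bin edges
    weights = 1.0 - distances / half_width
    return np.average(y[in_bin], weights=weights)

x_demo = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y_demo = np.array([10.0, 2.0, 4.0, 6.0, 10.0])
# Points at x = 0.5, 1.0, 1.5 get weights 0.5, 1.0, 0.5
result = triangular_bin_average(x_demo, y_demo, bin_center=1.0, bin_size=2.0)
# result is 4.0
```

Inside the question's loop, this would replace the np.mean(items_in_bin) call for each bin center.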

Answer 2

This won't give an exact solution, but it will make your life easier and will probably be good enough. First, average your samples in small bins. Once you have resampled your data to be equispaced, you can use stride tricks and np.average to do a weighted average:

import numpy as np
from numpy.lib.stride_tricks import as_strided

def moving_weighted_average(x, y, step_size=.1, steps_per_bin=10,
                            weights=None):
    # This ensures that all samples are within a bin
    number_of_bins = int(np.ceil(np.ptp(x) / step_size))
    bins = np.linspace(np.min(x), np.min(x) + step_size*number_of_bins,
                       num=number_of_bins+1)
    bins -= (bins[-1] - np.max(x)) / 2
    bin_centers = bins[:-steps_per_bin] + step_size*steps_per_bin/2

    counts, _ = np.histogram(x, bins=bins)
    vals, _ = np.histogram(x, bins=bins, weights=y)
    bin_avgs = vals / counts
    n = len(bin_avgs)
    windowed_bin_avgs = as_strided(bin_avgs,
                                   (n-steps_per_bin+1, steps_per_bin),
                                   bin_avgs.strides*2)

    weighted_average = np.average(windowed_bin_avgs, axis=1, weights=weights)

    return bin_centers, weighted_average
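To make the stride-trick step easier to follow, here is a toy version of just the windowing and weighting, run on a short array that stands in for the per-bin averages. The array values, the window length of 3, and the variable names are made up for this illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

values = np.arange(6, dtype=float)   # stands in for the per-bin averages
window = 3
n = len(values)
# Each row is a length-3 view into values, shifted by one element per row.
# Reusing the element stride for both axes is what creates the overlap.
windows = as_strided(values, shape=(n - window + 1, window),
                     strides=values.strides * 2, writeable=False)
# windows is [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5]]
weights = np.array([1.0, 2.0, 1.0])  # small triangular weight profile
smoothed = np.average(windows, axis=1, weights=weights)
# smoothed is [1.0, 2.0, 3.0, 4.0]
```

Note that in the full function a small bin containing no samples makes vals / counts evaluate to NaN for that bin, and that NaN then propagates into every window that covers it.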

You can now do something like this:

#plot the moving average with triangular weights
weights = np.concatenate((np.arange(0, 5), np.arange(0, 5)[::-1]))
bins, average = moving_weighted_average(x, y, steps_per_bin=len(weights),
                                        weights=weights)
plt.plot(bins, average,label='moving average')

plt.show()

2 answers

Answer 1
7 accepted 2013-08-29 18:34:05

Answer 2
4 2013-08-29 18:51:15
