简体   繁体   English

python中的加权移动平均线

[英]Weighted moving average in python

I have data sampled at essentially random intervals. 我有基本随机间隔采样的数据。 I would like to compute a weighted moving average using numpy (or other python package). 我想用numpy(或其他python包)来计算加权移动平均线。 I have a crude implementation of a moving average, but I am having trouble finding a good way to do a weighted moving average, so that the values towards the center of the bin are weighted more than values towards the edges. 我有一个移动平均线的粗略实现,但我很难找到一个好的方法来进行加权移​​动平均线,因此朝向边框中心的值的加权大于边缘的值。

Here I generate some sample data and then take a moving average. 在这里,我生成一些样本数据,然后采用移动平均线。 How can I most easily implement a weighted moving average? 我怎样才能最轻松地实现加权移动平均线? Thanks! 谢谢!

import numpy as np
import matplotlib.pyplot as plt

#first generate some datapoint for a randomly sampled noisy sinewave
x = np.random.random(1000)*10
noise = np.random.normal(scale=0.3,size=len(x))
y = np.sin(x) + noise

#plot the data
plt.plot(x,y,'ro',alpha=0.3,ms=4,label='data')
plt.xlabel('Time')
plt.ylabel('Intensity')

#define a moving average function
def moving_average(x,y,step_size=.1,bin_size=1):
    bin_centers  = np.arange(np.min(x),np.max(x)-0.5*step_size,step_size)+0.5*step_size
    bin_avg = np.zeros(len(bin_centers))

    for index in range(0,len(bin_centers)):
        bin_center = bin_centers[index]
        items_in_bin = y[(x>(bin_center-bin_size*0.5) ) & (x<(bin_center+bin_size*0.5))]
        bin_avg[index] = np.mean(items_in_bin)

    return bin_centers,bin_avg

#plot the moving average
bins, average = moving_average(x,y)
plt.plot(bins, average,label='moving average')

plt.show()

The output: 输出: 数据和移动平均线

Using the advice from crs17 to use "weights=" in the np.average function, I came up weighted average function, which uses a Gaussian function to weight the data: 使用crs17的建议在np.average函数中使用“weights =”,我得到了加权平均函数,它使用高斯函数来加权数据:

def weighted_moving_average(x,y,step_size=0.05,width=1):
    bin_centers  = np.arange(np.min(x),np.max(x)-0.5*step_size,step_size)+0.5*step_size
    bin_avg = np.zeros(len(bin_centers))

    #We're going to weight with a Gaussian function
    def gaussian(x,amp=1,mean=0,sigma=1):
        return amp*np.exp(-(x-mean)**2/(2*sigma**2))

    for index in range(0,len(bin_centers)):
        bin_center = bin_centers[index]
        weights = gaussian(x,mean=bin_center,sigma=width)
        bin_avg[index] = np.average(y,weights=weights)

    return (bin_centers,bin_avg)

Results look good: 结果看起来不错: 使用numpy加权平均值

You could use numpy.average which allows you to specify weights: 您可以使用numpy.average来指定权重:

>>> bin_avg[index] = np.average(items_in_bin, weights=my_weights)

So to calculate the weights you could find the x coordinates of each data point in the bin and calculate their distances to the bin center. 因此,要计算权重,您可以找到仓中每个数据点的x坐标,并计算它们到仓中心的距离。

This won't give an exact solution, but it will make your life easier, and will probably be good enough... First, average your samples in small bins. 这不会给出确切的解决方案,但它会让您的生活更轻松,并且可能足够好......首先,将您的样品放在小容器中。 Once you have resampled your data to be equispaced, you can use stride tricks and np.average to do a weighted average: 一旦您重新采样数据为等间隔,您可以使用步幅技巧和np.average进行加权平均:

from numpy.lib.stride_tricks import as_strided

def moving_weighted_average(x, y, step_size=.1, steps_per_bin=10,
                            weights=None):
    # This ensures that all samples are within a bin
    number_of_bins = int(np.ceil(np.ptp(x) / step_size))
    bins = np.linspace(np.min(x), np.min(x) + step_size*number_of_bins,
                       num=number_of_bins+1)
    bins -= (bins[-1] - np.max(x)) / 2
    bin_centers = bins[:-steps_per_bin] + step_size*steps_per_bin/2

    counts, _ = np.histogram(x, bins=bins)
    vals, _ = np.histogram(x, bins=bins, weights=y)
    bin_avgs = vals / counts
    n = len(bin_avgs)
    windowed_bin_avgs = as_strided(bin_avgs,
                                   (n-steps_per_bin+1, steps_per_bin),
                                   bin_avgs.strides*2)

    weighted_average = np.average(windowed_bin_avgs, axis=1, weights=weights)

    return bin_centers, weighted_average

You can now do something like this: 你现在可以这样做:

#plot the moving average with triangular weights
weights = np.concatenate((np.arange(0, 5), np.arange(0, 5)[::-1]))
bins, average = moving_weighted_average(x, y, steps_per_bin=len(weights),
                                        weights=weights)
plt.plot(bins, average,label='moving average')

plt.show()

在此输入图像描述

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM