Time weighted moving average in pandas

I would like to compute a time-weighted moving average in pandas, where each observation is weighted in proportion to how recent it is.

Here is some sample data:

import numpy as np
import pandas as pd

dates = ['01/01/2021','02/01/2021','03/01/2021','04/01/2021','05/01/2021','06/01/2021']
swimmer1_place = ['1','1','4','3',np.nan,np.nan]
swimmer2_place = [np.nan,'3','1',np.nan,'4','2']
swimmer3_place = ['2','2','3',np.nan,'3','1']

df = pd.DataFrame({'date':dates,'swimmer_1_place':swimmer1_place,'swimmer_2_place':swimmer2_place,'swimmer_3_place':swimmer3_place})
df['date'] = pd.to_datetime(df['date'])


What would be the best way to go about this? I have tried the built-in pandas EWM method with limited success, because it doesn't account for the varying time intervals between each swimmer's results.
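For reference, `ewm` accepts a `times` argument together with a timedelta-like `halflife`, so the decay can follow the actual dates rather than the row positions. The sketch below is only an illustration under assumptions: the 30-day half-life is an arbitrary choice, not a value from the question, and rows where the swimmer did not compete are dropped before smoothing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date': pd.to_datetime(['01/01/2021', '02/01/2021', '03/01/2021',
                            '04/01/2021', '05/01/2021', '06/01/2021']),
    'swimmer_1_place': [1, 1, 4, 3, np.nan, np.nan],
})

# Keep only the dates on which this swimmer has a result.
obs = df.dropna(subset=['swimmer_1_place'])

# halflife='30 days' is an illustrative assumption: a result loses half
# of its weight for every 30 days of age relative to the current row.
ewm_place = obs['swimmer_1_place'].ewm(
    halflife='30 days', times=pd.DatetimeIndex(obs['date'])
).mean()

print(ewm_place)
```

Repeating this per swimmer column, each with its own non-missing rows, keeps one swimmer's gaps from affecting another swimmer's weights.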

After converting your column values to integers:

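def convert(x):
    # Keep missing values as NaN; cast everything else to int
    return np.nan if pd.isna(x) else int(x)

swimmer1_place = [convert(a) for a in swimmer1_place]
swimmer2_place = [convert(a) for a in swimmer2_place]
swimmer3_place = [convert(a) for a in swimmer3_place]

# Write the converted values back so the DataFrame holds numbers, not strings
df['swimmer_1_place'] = swimmer1_place
df['swimmer_2_place'] = swimmer2_place
df['swimmer_3_place'] = swimmer3_place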
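As an alternative sketch for the same conversion step, `pd.to_numeric` can be applied to the place columns directly. The `place_cols` name and the two-column frame are illustrative only; note that the result is float rather than int, because NaN cannot be stored in a plain integer column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'swimmer_1_place': ['1', '1', '4', '3', np.nan, np.nan],
    'swimmer_2_place': [np.nan, '3', '1', np.nan, '4', '2'],
})

# Select every column holding placements (hypothetical naming rule).
place_cols = [c for c in df.columns if c.endswith('_place')]

# pd.to_numeric parses the strings and leaves NaN untouched.
df[place_cols] = df[place_cols].apply(pd.to_numeric)

print(df.dtypes)
```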

You can get a weighted average like the following (of course, there may be other ways to do this):

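# base_time is not defined in the original answer. Assumption: a reference
# date earlier than every observation, so recent rows get larger weights.
base_time = df['date'].min() - pd.Timedelta(days=1)

for column in df.columns:
    if column != "date":
        mask = df[column].notna()
        # Time between the reference date and each observed result
        elapsed = (base_time - df['date'])[mask].values
        weights = elapsed / np.sum(elapsed)  # normalised, sums to 1
        column_vals = pd.to_numeric(df.loc[mask, column]).values
        # Store each result's weighted contribution; summing a column
        # gives that swimmer's time-weighted average place.
        df.loc[mask, column] = weights * column_vals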
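The same weighting can be sketched as a standalone function that returns one number per swimmer instead of overwriting the column. The name `time_weighted_average`, the reference date, and the sample values are all hypothetical; the weights are assumed to be proportional to the days elapsed since a reference date that precedes every observation.

```python
import numpy as np
import pandas as pd

def time_weighted_average(dates, values, base_time):
    """Average `values`, weighting each by days elapsed since `base_time`."""
    values = pd.to_numeric(pd.Series(values), errors='coerce').to_numpy(dtype=float)
    dates = pd.to_datetime(pd.Series(dates))
    mask = ~np.isnan(values)  # skip dates without a result
    elapsed = (dates[mask] - base_time).dt.days.to_numpy(dtype=float)
    weights = elapsed / elapsed.sum()  # normalised, sums to 1
    return float(np.sum(weights * values[mask]))

# Illustrative data: results 10, 20, 30 and 40 days after the reference date.
dates = ['2021-01-11', '2021-01-21', '2021-01-31', '2021-02-10']
places = ['4', np.nan, '2', '1']

result = time_weighted_average(dates, places, pd.Timestamp('2021-01-01'))
print(result)  # 1.75: weights 0.125, 0.375, 0.5 applied to places 4, 2, 1
```

Because the weights grow linearly with elapsed time, the choice of reference date changes how strongly recent results dominate; a reference date far in the past makes the weights nearly equal.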
