
How to include Moving Average with Pandas based on Values on other Columns

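I am trying to compute a moving average for the following DataFrame, but I cannot join the result back to the DataFrame.
The DataFrame is (the expected moving average is shown in parentheses):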

Key1 Key2 Value MovingAverage  
  1    2    1       (Nan)
  1    7    2       (Nan)
  1    8    3       (Nan)
  2    5    1       (Nan)
  2    3    2       (Nan)
  2    2    3       (Nan)
  3    7    1       (Nan)
  3    5    2       (Nan)
  3    8    3       (Nan)
  4    7    1       (1.33)
  4    2    2        (2)
  4    9    3       (Nan)
  5    8    1       (2.33)
  5    3    2       (Nan)
  5    9    3       (Nan)
  6    2    1        (2)
  6    7    2       (1.33)
  6    9    3        (3)

The code is:

import pandas as pd
d = {'Key1':[1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6], 'Key2':[2,7,8,5,3,2,7,5,8,7,2,9,8,3,9,2,7,9],'Value':[1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]}
df = pd.DataFrame(d)
print(df)
MaDf = df.groupby(['Key2'])['Value'].rolling(window=3).mean().to_frame('mean')
print (MaDf) 

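If you run the code, it correctly computes the moving average of 'Value' within each 'Key2' group, but I cannot find a way to insert it back into the original DataFrame (df) correctly.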

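Remove the first level of the MultiIndex with Series.reset_index and drop=True, so that the result aligns with df on the second level (the original row labels):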

df['mean'] = (df.groupby('Key2')['Value']
                .rolling(window=3)
                .mean()
                .reset_index(level=0, drop=True))
print (df)
    Key1  Key2  Value      mean
0      1     2      1       NaN
1      1     7      2       NaN
2      1     8      3       NaN
3      2     5      1       NaN
4      2     3      2       NaN
5      2     2      3       NaN
6      3     7      1       NaN
7      3     5      2       NaN
8      3     8      3       NaN
9      4     7      1  1.333333
10     4     2      2  2.000000
11     4     9      3       NaN
12     5     8      1  2.333333
13     5     3      2       NaN
14     5     9      3       NaN
15     6     2      1  2.000000
16     6     7      2  1.333333
17     6     9      3  3.000000
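
To see why dropping the first level is enough, it can help to inspect the index of the grouped result before and after reset_index. The following is a minimal sketch that reuses the question's data; the names rolled and aligned are only illustrative and do not appear in the original answer.

```python
import pandas as pd

d = {'Key1': [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6],
     'Key2': [2,7,8,5,3,2,7,5,8,7,2,9,8,3,9,2,7,9],
     'Value': [1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]}
df = pd.DataFrame(d)

# The grouped rolling mean carries a two-level MultiIndex:
# level 0 is Key2, level 1 is the original row label of df.
rolled = df.groupby('Key2')['Value'].rolling(window=3).mean()
print(rolled.index.nlevels)

# Dropping level 0 leaves only the original row labels, so the
# assignment below aligns by label, whatever order the rows are in.
aligned = rolled.reset_index(level=0, drop=True)
df['mean'] = aligned
print(df.loc[[9, 10, 11]])
```

Because the assignment aligns on labels, this variant should also work when df does not have a default RangeIndex, as long as its index labels are unique.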

If df has the default RangeIndex, you can instead sort the result with Series.sort_index on the second level and assign the values:

df['mean'] = (df.groupby(['Key2'])['Value']
                .rolling(window=3)
                .mean()
                .sort_index(level=1)
                .values)
print (df)
    Key1  Key2  Value      mean
0      1     2      1       NaN
1      1     7      2       NaN
2      1     8      3       NaN
3      2     5      1       NaN
4      2     3      2       NaN
5      2     2      3       NaN
6      3     7      1       NaN
7      3     5      2       NaN
8      3     8      3       NaN
9      4     7      1  1.333333
10     4     2      2  2.000000
11     4     9      3       NaN
12     5     8      1  2.333333
13     5     3      2       NaN
14     5     9      3       NaN
15     6     2      1  2.000000
16     6     7      2  1.333333
17     6     9      3  3.000000
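
As a sanity check, the two variants can be compared directly. This is a small sketch under the assumption that df has the default RangeIndex, as in the question; by_label and by_sorted are hypothetical names introduced here for illustration.

```python
import numpy as np
import pandas as pd

d = {'Key1': [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6],
     'Key2': [2,7,8,5,3,2,7,5,8,7,2,9,8,3,9,2,7,9],
     'Value': [1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]}
df = pd.DataFrame(d)

rolled = df.groupby('Key2')['Value'].rolling(window=3).mean()

# Variant 1: drop the Key2 level and align on the original row labels.
by_label = rolled.reset_index(level=0, drop=True).reindex(df.index).to_numpy()

# Variant 2: sort by the original row labels (level 1), take raw values.
by_sorted = rolled.sort_index(level=1).to_numpy()

# NaN positions are treated as equal for the comparison.
print(np.allclose(by_label, by_sorted, equal_nan=True))
```

The second variant relies on position rather than labels, which is why it is only safe when sorting level 1 reproduces the row order of df.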

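Another suggestion was simply df['mean'] = df.groupby(['Key2'])['Value'].rolling(window=3).mean().values. Note that the grouped result is ordered by Key2 group rather than by the original row order, so assigning .values directly only lines up with the rows when df is already sorted by Key2.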
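
Whether the plain .values shortcut lines up with the original rows can be checked directly. The sketch below uses the question's data and compares a positional assignment with a label-aligned one; the names by_position, by_label and df_sorted are illustrative only.

```python
import numpy as np
import pandas as pd

d = {'Key1': [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6],
     'Key2': [2,7,8,5,3,2,7,5,8,7,2,9,8,3,9,2,7,9],
     'Value': [1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3]}
df = pd.DataFrame(d)

rolled = df.groupby('Key2')['Value'].rolling(window=3).mean()

# Raw values come out grouped by Key2, not in the row order of df.
by_position = rolled.to_numpy()
# Label-aligned values follow the row order of df.
by_label = rolled.reset_index(level=0, drop=True).reindex(df.index).to_numpy()
print(np.allclose(by_position, by_label, equal_nan=True))

# When the frame is already sorted by Key2 (stable sort), the two agree.
df_sorted = df.sort_values('Key2', kind='stable')
rolled_sorted = df_sorted.groupby('Key2')['Value'].rolling(window=3).mean()
sorted_by_label = (rolled_sorted.reset_index(level=0, drop=True)
                                .reindex(df_sorted.index)
                                .to_numpy())
print(np.allclose(rolled_sorted.to_numpy(), sorted_by_label, equal_nan=True))
```

On the question's data the first comparison reports a mismatch and the second reports agreement, which suggests the shortcut is only appropriate for frames already ordered by the grouping key.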

