Pandas Rolling Function is not working properly
I have the following DataFrame sample:
import pandas as pd

df = pd.DataFrame({
    'date': ['2021-05-03', '2021-05-10', '2021-05-17', '2021-05-24', '2021-05-31',
             '2021-06-07', '2021-06-14', '2021-06-21', '2021-06-28', '2021-07-05',
             '2021-07-12', '2021-07-19', '2021-05-26'],
    'spend': [1418, 4130, 4216, 3374, 3587, 3665, 4118, 4534, 4829, 3156, 2998, 3025, 3397],
})
This is the code used:
df['spend avg'] = df['spend'].rolling(7).median()
This is the output that I got:
df = pd.DataFrame({'date' : ['2021-05-03','2021-05-10','2021-05-17','2021-05-24',
'2021-05-31','2021-06-07','2021-06-14','2021-06-21','2021-06-28','2021-07-05','2021-07-12','2021-07-19','2021-05-26'], 'spend':[1418,4130,4216,3374,3587,3665,4118,4534,4829,3156,2998,3025,3397], 'spend_avg' :[np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,3665.0,4118.0,4118.0,3665.0,3665.0,3665.0,3397.0]})
As you can see, it is not calculating the average over the rolling window (window = 7). I understand the NaNs are normal, but if you look at the values in the spend avg column, they are repeated from the spend column.
Why is this happening? What am I doing wrong?
The desired output would be:
df = pd.DataFrame({'date' : ['2021-05-03','2021-05-10','2021-05-17','2021-05-24','2021-05-31','2021-06-07','2021-06-14','2021-06-21','2021-06-28','2021-07-05','2021-07-12','2021-07-19','2021-05-26'], 'spend_avg' :[np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,3501,3946,3894,3841,3665,3760,3722]})
Thanks!
You want mean, not median:
In [667]: df.rolling(window=7).mean()
Out[667]:
          spend
0           NaN
1           NaN
2           NaN
3           NaN
4           NaN
5           NaN
6   3501.142857
7   3946.285714
8   4046.142857
9   3894.714286
10  3841.000000
11  3760.714286
12  3722.428571
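To see why the median results looked like they were copied from the spend column: the median of an odd-sized window (7 here) is always one of the values inside that window, whereas the mean generally is not. Below is a minimal sketch comparing the two on the question's data. The names spend, rolling_median, rolling_mean, comparison, and medians_are_observed are just illustrative, and the rolling call is applied to the numeric column only, on the assumption that the string date column should stay out of the calculation.

```python
import pandas as pd

# Spend values copied from the question.
spend = pd.Series(
    [1418, 4130, 4216, 3374, 3587, 3665, 4118,
     4534, 4829, 3156, 2998, 3025, 3397],
    name="spend",
)

# Same 7-row window, two different aggregations.
rolling_median = spend.rolling(window=7).median()
rolling_mean = spend.rolling(window=7).mean()

# Side-by-side view; the first 6 rows are NaN because the window is not full yet.
comparison = pd.DataFrame({
    "spend": spend,
    "median_7": rolling_median,
    "mean_7": rolling_mean.round(2),
})
print(comparison)

# With an odd window, every median is an actual value from the spend column.
medians_are_observed = rolling_median.dropna().isin(spend.astype(float)).all()
print(medians_are_observed)
```

If the result should live in the original frame, assigning df['spend_avg'] = df['spend'].rolling(7).mean() follows the same pattern as the code in the question, with only the aggregation swapped.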