
Why does Pandas rolling mean centre the window?

I want to create a graph of annual data and a 5-year moving average comprising the current and previous 4 years' values. However, my 5-year window is centred and I can't figure out why. By that I mean the first moving-average value appears in the 3rd year, and the final value is in the 3rd-last year. With my data, the moving average falls off a cliff because the final year is incomplete. I had anticipated dropping the final value as well, but I can't figure out how to get the moving average to work as intended.

My code is below:

#Plot historical revenue for context. Drop last year as it is incomplete
data=df_full.groupby('year').agg(Revenue=('price',sum)).reset_index()
data=data[:-1]
dataMA=df_full.groupby('year').agg(Revenue=('price',sum)).reset_index().rolling(5,center=False).mean()

fig=go.Figure()
fig.add_trace(go.Scatter(x=data.year, y=data.Revenue, name="Revenue"))
fig.add_trace(go.Scatter(x=dataMA.year, y=dataMA.Revenue, name="5 year Average"))
fig.update_layout(title="Annual Revenue 2001 to 2019",
                  xaxis_title="Year",
                  yaxis_title="Annual Revenue $")
fig.show()

I tried adding "center=False", but this made no difference. The graph still looks like the one below.

[image]

See, it is supposed to work. Since I don't have your dataset and don't know how it looks, I created my own:

import numpy as np
import pandas as pd

ser = pd.Series(np.random.randint(10, 1000, 19), index=range(2001, 2020))

# Should look like this after your Group by
2001    578
2002    388
2003    803
2004    413
2005    125
2006    331
2007    179
2008    180
2009    331
2010    875
2011    422
2012    699
2013    256
2014    918
2015    566
2016    754
2017    521
2018    200
2019     16
dtype: int32

Now, doing the rolling:

import matplotlib.pyplot as plt

ser.plot()
plt.ylim([0, ser.max()])
ser.rolling(5, center=False).mean().plot()
plt.xticks(range(2000, 2020, 5))

The result is: [image]
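The same thing can be checked numerically instead of by eye. This is only a minimal sketch: the values 1 to 19 are made up so the result is deterministic (they are not the random series above), and the variable names are just for illustration. It compares where the first and last non-NaN averages land for a trailing and a centred 5-year window.

```python
import pandas as pd

# Made-up deterministic data (assumption): values 1..19 for the years 2001..2019
ser = pd.Series(range(1, 20), index=range(2001, 2020), dtype="float64")

trailing = ser.rolling(5, center=False).mean()  # current year + previous 4
centred = ser.rolling(5, center=True).mean()    # 2 years either side

# A trailing window gives its first value in the 5th year and runs to the last year
first_trailing = trailing.first_valid_index()
last_trailing = trailing.last_valid_index()

# A centred window starts in the 3rd year and stops in the 3rd-last year
first_centred = centred.first_valid_index()
last_centred = centred.last_valid_index()

print(first_trailing, last_trailing)
print(first_centred, last_centred)
```

If your own averaged line starts in the 3rd year and ends in the 3rd-last year, it is behaving like the centred case, whatever "center" is set to.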

Now, I think:

You should first get your data into the simple form shown just above and store it in a variable, instead of chaining all the operations together in one long line.

Then try the same. It should work.
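To illustrate why splitting the steps may matter here, below is a rough sketch with a made-up stand-in for df_full (the real data isn't available, so the values are assumptions). One thing worth checking: when rolling() is applied to a DataFrame after reset_index(), the "year" column is averaged along with the revenue, so each 5-year mean ends up paired with the middle year of its window, which would look exactly like a centred window on the plot.

```python
import pandas as pd

# Hypothetical stand-in for df_full: two sales per year, 2001..2010
df_full = pd.DataFrame({
    "year": [y for y in range(2001, 2011) for _ in range(2)],
    "price": [float(p) for p in range(1, 21)],
})

# Step 1: aggregate and keep 'year' as the index, so it is not averaged
revenue = df_full.groupby("year")["price"].sum()

# Step 2: trailing 5-year mean of the revenue only
revenue_ma = revenue.rolling(5, center=False).mean()

# For comparison: rolling over the reset_index() frame averages 'year' too
stacked = revenue.reset_index().rolling(5, center=False).mean()

print(revenue_ma.first_valid_index())     # first year with a full window
print(stacked["year"].dropna().iloc[0])   # 'year' value paired with that same mean
```

With the index kept as the year, plotting revenue_ma.index against revenue_ma.values puts each average at the last year of its window.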
