How to calculate a rolling average for each product?
I have the first three columns below in a pandas DataFrame. I want to calculate the 3-day moving average for each product, as shown in the fourth column.
Data:
print (df)
       Date     Product  Demand  mov Avg
0  1-Jan-19  Product-01       3      NaN
1  2-Jan-19  Product-01       4      NaN
2  3-Jan-19  Product-01       5      4.0
3  4-Jan-19  Product-01       6      5.0
4  5-Jan-19  Product-01       7      6.0
5  3-Jan-19  Product-02       2      NaN
6  4-Jan-19  Product-02       3      NaN
7  5-Jan-19  Product-02       4      3.0
8  6-Jan-19  Product-02       5      4.0
9  7-Jan-19  Product-02       8      5.7
I tried using groupby and a rolling mean, but it doesn't seem to work:
df['mov_avg'] =df.set_index('Date').groupby('Product').rolling('Demand',window=7).mean().reset_index(drop=True)
First convert the Date column to datetimes:
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')
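As a quick check of the conversion above, here is a minimal sketch on a few of the sample dates. The names `raw` and `parsed` are made up for illustration, and the month abbreviations are assumed to be parsed under an English locale.

```python
import pandas as pd

# A few date strings in the same day-month-year style as the question's data
raw = pd.Series(['1-Jan-19', '2-Jan-19', '7-Jan-19'])

# %d = day of month, %b = abbreviated month name, %y = two-digit year
parsed = pd.to_datetime(raw, format='%d-%b-%y')

# Show the parsed values in ISO form
print(parsed.dt.strftime('%Y-%m-%d').tolist())
# ['2019-01-01', '2019-01-02', '2019-01-07']
```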
Your solution should be changed to select the Demand column and call rolling(3). The freq argument that rolling accepted in older pandas versions is no longer available, so a plain 3-row window is used here; this assumes one row per product per consecutive day, as in the sample data:
# sort by Product, then Date, so the rows line up with the grouped result
df = df.sort_values(['Product','Date']).reset_index(drop=True)
df['mov_avg'] = (df.set_index('Date')
                   .groupby('Product')['Demand']
                   .rolling(3)
                   .mean()
                   .reset_index(drop=True))
Another, better solution is to use DataFrame.join, which aligns the result on Product and Date instead of relying on row order:
s = df.set_index('Date').groupby('Product')['Demand'].rolling(3).mean()
df = df.join(s.rename('mov_avg'), on=['Product','Date'])
print (df)
        Date     Product  Demand  mov Avg   mov_avg
0 2019-01-01  Product-01       3      NaN       NaN
1 2019-01-02  Product-01       4      NaN       NaN
2 2019-01-03  Product-01       5      4.0  4.000000
3 2019-01-04  Product-01       6      5.0  5.000000
4 2019-01-05  Product-01       7      6.0  6.000000
5 2019-01-03  Product-02       2      NaN       NaN
6 2019-01-04  Product-02       3      NaN       NaN
7 2019-01-05  Product-02       4      3.0  3.000000
8 2019-01-06  Product-02       5      4.0  4.000000
9 2019-01-07  Product-02       8      5.7  5.666667
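If some products have missing days, a window of three rows no longer covers three calendar days. One possible variation, sketched here on the assumption that a calendar-based window is what is wanted, passes the offset '3D' together with min_periods=3 so that incomplete windows stay NaN. The frame `demo` and the column `mov_avg_3d` are names invented for this example.

```python
import pandas as pd

# Rebuild the sample data from the question (dates already parsed)
demo = pd.DataFrame({
    'Date': pd.to_datetime(
        ['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04', '2019-01-05',
         '2019-01-03', '2019-01-04', '2019-01-05', '2019-01-06', '2019-01-07']),
    'Product': ['Product-01'] * 5 + ['Product-02'] * 5,
    'Demand': [3, 4, 5, 6, 7, 2, 3, 4, 5, 8],
})

# Time-based window: each row averages the demand observed in the 3 days
# ending at that row's date; min_periods=3 leaves shorter windows as NaN
s = (demo.set_index('Date')
         .groupby('Product')['Demand']
         .rolling('3D', min_periods=3)
         .mean())

# Join back on (Product, Date) so the result does not depend on row order
demo = demo.join(s.rename('mov_avg_3d'), on=['Product', 'Date'])
print(demo)
```

On the sample data this gives the same values as the 3-row window, because every product has one row per consecutive day. The dates must be sorted within each product for the time-based window to work.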