Moving average in pandas with condition
I have a dataframe with the following structure:
import numpy as np
import pandas as pd
df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"] * 2,
        "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "x": [1, 2, 2, 3, 2, 3, 4, 2],
        "condition": [1, 0, 1, 0] * 2
    }
)
df
I want to calculate the rolling moving average of the last 3 days of the column x, using only the data where condition = 1. The outcome should be the following:
How can I do that in pandas? Thanks!
Keep in mind it is not the same as this:
Rolling function in pandas with condition
Here I'm looking for the moving average of the last 3 days; in the other question I just wanted the rolling average.
First replace the rows that do not match the condition with NaN using Series.where, then, per group, shift the values and call the rolling method:
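# shift() excludes the current row; rolling(3, min_periods=1) averages the previous 3 rows, skipping NaN
f = lambda x: x.shift().rolling(3, min_periods=1).mean()
# mask x where condition != 1, then apply f within each group
df['roll'] = (df.assign(x = df['x'].where(df['condition'].eq(1)))
                .groupby('group')['x']
                .transform(f))
print (df)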
         date group  x  condition  roll
0  2020-01-01     A  1          1   NaN
1  2020-01-02     A  2          0   1.0
2  2020-01-03     A  2          1   1.0
3  2020-01-04     A  3          0   1.5
4  2020-01-01     B  2          1   NaN
5  2020-01-02     B  3          0   2.0
6  2020-01-03     B  4          1   2.0
7  2020-01-04     B  2          0   3.0
Details:
print (df.assign(x = df['x'].where(df['condition'].eq(1))))
         date group    x  condition
0  2020-01-01     A  1.0          1
1  2020-01-02     A  NaN          0
2  2020-01-03     A  2.0          1
3  2020-01-04     A  NaN          0
4  2020-01-01     B  2.0          1
5  2020-01-02     B  NaN          0
6  2020-01-03     B  4.0          1
7  2020-01-04     B  NaN          0
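
To see how the mask, the shift and the rolling mean combine, it may help to compute them as separate columns. The following is only a sketch of the same approach split into steps; the names masked, shifted, roll and steps are introduced here for illustration and do not appear in the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"] * 2,
        "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "x": [1, 2, 2, 3, 2, 3, 4, 2],
        "condition": [1, 0, 1, 0] * 2,
    }
)

# Step 1: keep x only where condition == 1, NaN elsewhere.
masked = df["x"].where(df["condition"].eq(1))

# Step 2: shift by one row within each group, so the current row
# is excluded from its own average.
shifted = masked.groupby(df["group"]).shift()

# Step 3: mean over a window of 3 rows within each group;
# NaN values inside the window are skipped, and min_periods=1
# means one valid value is enough to produce a result.
roll = shifted.groupby(df["group"]).transform(
    lambda s: s.rolling(3, min_periods=1).mean()
)

steps = pd.DataFrame({"masked": masked, "shifted": shifted, "roll": roll})
print(steps)
```

Note that rolling(3) counts rows, not calendar days. It corresponds to "the last 3 days" here only because each group has exactly one row per consecutive day.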