Pandas dataframe rolling sum column with groupby
I'm trying to create a new column that gives a rolling sum of the values in the Value column. The rolling sum covers 4 rows: the current row and the next three rows. I want to do this for each type in the 'Type' column.
However, if there are fewer than 4 rows before the next type starts, I want the rolling sum to use only the remaining rows. For example, if there are 2 rows after the current row for the current type, a total of 3 rows is used for the rolling sum. See the table below showing what I'm currently getting and what I expect.
Index | Type | Value | Current Rolling Sum | Expected Rolling Sum |
---|---|---|---|---|
1 | left | 5 | 22 | 22 |
2 | left | 9 | 34 | 34 |
3 | left | 0 | NaN | 25 |
4 | left | 8 | NaN | 25 |
5 | left | 17 | NaN | 17 |
6 | straight | 7 | 61 | 61 |
7 | straight | 4 | 77 | 77 |
8 | straight | 0 | 86 | 86 |
9 | straight | 50 | 97 | 97 |
10 | straight | 23 | NaN | 47 |
11 | straight | 13 | NaN | 24 |
12 | straight | 11 | NaN | 11 |
The following code is what I'm currently using to get the rolling sum.
rolling_sum = df.groupby('Type', sort=False)['Value'].rolling(4, min_periods = 3).sum().shift(-3).reset_index()
rolling_sum = rolling_sum.rename(columns={'Value': 'Rolling Sum'})
extracted_col = rolling_sum['Rolling Sum']
df = df.join(extracted_col)
I would really appreciate your help.
You can try running the rolling sum on the reversed values for each group and then reversing the result back afterward, using a min_periods of 1:
df['Rolling Sum'] = df.groupby('Type', sort=False)['Value'].apply(lambda x: x[::-1].rolling(4, min_periods=1).sum()[::-1])
Result:
Index Type Value Rolling Sum
0 1 left 5 22.0
1 2 left 9 34.0
2 3 left 0 25.0
3 4 left 8 25.0
4 5 left 17 17.0
5 6 straight 7 61.0
6 7 straight 4 77.0
7 8 straight 0 86.0
8 9 straight 50 97.0
9 10 straight 23 47.0
10 11 straight 13 24.0
11 12 straight 11 11.0
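For reference, here is a self-contained sketch of the same idea that rebuilds the sample data from the question. It uses transform instead of apply, since depending on the pandas version apply may prepend the group keys to the result's index, which can get in the way of assigning the result straight back to df. The 'Forward Sum' column is an addition for comparison, not part of the answer above: it assumes a pandas version that provides FixedForwardWindowIndexer and computes the same forward-looking window without reversing.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Index": range(1, 13),
    "Type": ["left"] * 5 + ["straight"] * 7,
    "Value": [5, 9, 0, 8, 17, 7, 4, 0, 50, 23, 13, 11],
})

grouped = df.groupby("Type", sort=False)["Value"]

# Reverse each group, take a trailing 4-row rolling sum, then reverse back.
# min_periods=1 lets the last rows of each group use fewer than 4 values.
df["Rolling Sum"] = grouped.transform(
    lambda s: s[::-1].rolling(4, min_periods=1).sum()[::-1]
)

# Alternative (assumption: FixedForwardWindowIndexer is available):
# a forward-looking window over the current row and the next three rows.
forward = FixedForwardWindowIndexer(window_size=4)
df["Forward Sum"] = grouped.transform(
    lambda s: s.rolling(forward, min_periods=1).sum()
)

print(df)
```

Both columns should agree with the 'Expected Rolling Sum' column in the question, because the window never crosses from one Type into the next.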