Pandas DataFrame rolling sum column with groupby

Question

I'm trying to create a new column that gives a rolling sum of the values in the Value column. The rolling sum covers 4 rows, i.e. the current row and the next three rows. I want to do this separately for each type in the 'Type' column.

However, if there are fewer than 4 rows left before the next type starts, I want the rolling sum to use only the remaining rows. For example, if there are 2 rows after the current row for the current type, a total of 3 rows is used for the rolling sum. The table below shows what I'm currently getting and what I expect.

Index  Type      Value  Current Rolling Sum  Expected Rolling Sum
1      left      5      22                   22
2      left      9      34                   34
3      left      0      NaN                  25
4      left      8      NaN                  25
5      left      17     NaN                  17
6      straight  7      61                   61
7      straight  4      77                   77
8      straight  0      86                   86
9      straight  50     97                   97
10     straight  23     NaN                  47
11     straight  13     NaN                  24
12     straight  11     NaN                  11

The following code is what I'm currently using to get the rolling sum.

rolling_sum = df.groupby('Type', sort=False)['Value'].rolling(4, min_periods = 3).sum().shift(-3).reset_index()
rolling_sum = rolling_sum.rename(columns={'Value': 'Rolling Sum'})

extracted_col = rolling_sum['Rolling Sum']
df = df.join(extracted_col)

I would really appreciate your help.

Answer 1

You can try running the rolling sum on the reversed values of each group and then reversing the result back afterward, using a min_periods of 1 so that windows with fewer than 4 rows still produce a sum:

df['Rolling Sum'] = df.groupby('Type', sort=False)['Value'].apply(lambda x: x[::-1].rolling(4, min_periods=1).sum()[::-1])

Result:

   Index        Type    Value   Rolling Sum
0      1        left        5          22.0
1      2        left        9          34.0
2      3        left        0          25.0
3      4        left        8          25.0
4      5        left       17          17.0
5      6    straight        7          61.0
6      7    straight        4          77.0
7      8    straight        0          86.0
8      9    straight       50          97.0
9     10    straight       23          47.0
10    11    straight       13          24.0
11    12    straight       11          11.0
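
For anyone who wants to reproduce the result, here is a self-contained sketch under a few assumptions. The sample frame is rebuilt from the question's table. The reversal trick is run through transform rather than apply, because on newer pandas versions apply may prepend the group key to the result's index, which can break the column assignment. The second column, 'Rolling Sum (forward)', is a name made up for this example; it cross-checks the result with pandas.api.indexers.FixedForwardWindowIndexer, which I believe requires pandas 1.1 or later.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Index": range(1, 13),
    "Type": ["left"] * 5 + ["straight"] * 7,
    "Value": [5, 9, 0, 8, 17, 7, 4, 0, 50, 23, 13, 11],
})

# Variant 1: the reversal trick from the answer, run through transform
# so the result is aligned to df's original index.
df["Rolling Sum"] = df.groupby("Type", sort=False)["Value"].transform(
    lambda s: s[::-1].rolling(4, min_periods=1).sum()[::-1]
)

# Variant 2 (assumed alternative, not from the answer): a forward-looking
# window covering the current row plus the next three rows of the group.
indexer = FixedForwardWindowIndexer(window_size=4)
df["Rolling Sum (forward)"] = df.groupby("Type", sort=False)["Value"].transform(
    lambda s: s.rolling(indexer, min_periods=1).sum()
)

print(df)
```

Both columns should match the expected values from the question. The forward indexer states the intent (current row plus the next three) more directly, while the reversal trick only needs the plain rolling API.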

Answer 1
0 Accepted 2021-07-13 07:08:58
