Pandas groupby and cumulative count across multiple columns

Question

I have data ordered by ID and Year, followed by a series of event flags indicating whether something did or did not happen for that ID in that year:

ID  Year  x  y  z
1   2015  0  1  0
1   2016  1  1  0
1   2017  0  1  1
2   2015  1  0  1
2   2016  1  1  0
2   2017  0  1  1

I'd like to group by ID and, going through the years in order within each ID, apply a cumulative count to each "event" column, so that I'm left with something like the following:

ID  Year  x_total  y_total  z_total
1   2015  0        1        0
1   2016  1        2        0
1   2017  1        3        1
2   2015  1        0        1
2   2016  2        1        1
2   2017  2        2        2

I've looked at various options using cumsum and cumcount, but I can't figure this out.

Answer 1

You can use .groupby() + .cumsum() to get the cumulative count for each "event" column. Because the flags are 0/1, the running sum within each ID equals the running count of events. Then add _total as a suffix to the column names with .add_suffix() and join the result back to the first two columns (ID and Year):

df[['ID', 'Year']].join(df.groupby('ID')[['x', 'y', 'z']].cumsum().add_suffix('_total'))

Result:

   ID  Year  x_total  y_total  z_total
0   1  2015        0        1        0
1   1  2016        1        2        0
2   1  2017        1        3        1
3   2  2015        1        0        1
4   2  2016        2        1        1
5   2  2017        2        2        2
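
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; the variable names event_cols, totals, and result are only illustrative, and the sort step is an extra precaution (an assumption, not part of the one-liner above) in case the rows are not already ordered by ID and Year.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2, 2],
    "Year": [2015, 2016, 2017, 2015, 2016, 2017],
    "x":    [0, 1, 0, 1, 1, 0],
    "y":    [1, 1, 1, 0, 1, 1],
    "z":    [0, 0, 1, 1, 0, 1],
})

# Illustrative name: the 0/1 flag columns to accumulate.
event_cols = ["x", "y", "z"]

# Precaution (assumption): make sure rows are in year order within each ID,
# since cumsum follows row order.
df = df.sort_values(["ID", "Year"]).reset_index(drop=True)

# Running sum of each flag within each ID, with "_total" appended to the names.
totals = df.groupby("ID")[event_cols].cumsum().add_suffix("_total")

# Join back to the key columns; both sides share the same index.
result = df[["ID", "Year"]].join(totals)
print(result)
```

The join works because cumsum on a groupby returns a frame with the same index as the original, so no merge key is needed.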
