Make a grouped column by sum of another column with pandas
I have this data set:
+===+=======+======+=======+=======+
| | Group | Cost | Name1 | Name2 |
+===+=======+======+=======+=======+
| 0 | G1 | 1574 | N1A | N2A |
+---+-------+------+-------+-------+
| 1 | G2 | 1322 | N1B | N2B |
+---+-------+------+-------+-------+
| 2 | G3 | 1188 | N1C | N2C |
+---+-------+------+-------+-------+
| 3 | G3 | 942 | N1D | N2D |
+---+-------+------+-------+-------+
| 4 | G4 | 838 | N1E | N2E |
+---+-------+------+-------+-------+
| 5 | G5 | 5 | N1F | N2F |
+---+-------+------+-------+-------+
| 6 | G5 | 4 | N1F | N2G |
+---+-------+------+-------+-------+
| 7 | G5 | 3 | N1G | N2H |
+---+-------+------+-------+-------+
Now I want to group by "Group" and add a column containing the sum of the "Cost" column for each group. I don't know how to explain it better, so here is the expected result:
+===+=======+======+======+=======+=======+
| | Group | Sum | Cost | Name1 | Name2 |
+===+=======+======+======+=======+=======+
| 0 | G1 | 1574 | 1574 | N1A | N2A |
+---+-------+------+------+-------+-------+
| 1 | G2 | 1322 | 1322 | N1B | N2B |
+---+-------+------+------+-------+-------+
| 2 | G3 | 2130 | 1188 | N1C | N2C |
| | | +------+-------+-------+
| | | | 942 | N1D | N2D |
+---+-------+------+------+-------+-------+
| 3 | G4 | 838 | 838 | N1E | N2E |
+---+-------+------+------+-------+-------+
| 4 | G5 | 12 | 5 | N1F | N2F |
| | | +------+-------+-------+
| | | | 4 | N1F | N2G |
| | | +------+-------+-------+
| | | | 3 | N1G | N2H |
+---+-------+------+------+-------+-------+
How can I achieve this with pandas? Is that even possible? Sorry, I am new to this stuff.
Use GroupBy.transform with sum to get the per-group total on every row, and then, to display it your way, create a MultiIndex with DataFrame.set_index. Note that the 'missing' values in the MultiIndex are only hidden in the display; the repeated labels are still stored for every row:
df['Sum'] = df.groupby('Group')['Cost'].transform('sum')
df = df.set_index(['Group','Sum','Cost'])
Or:
df1 = (df.assign(Sum=df.groupby('Group')['Cost'].transform('sum'))
         .set_index(['Group', 'Sum', 'Cost']))
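For reference, here is a minimal self-contained sketch of the approach above. It rebuilds the sample data from the question by hand (the construction of df and the names indexed and flat are assumptions added for illustration, not part of the original answer), then shows that the repeated index labels are still present even though the printed output leaves them blank:

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "Group": ["G1", "G2", "G3", "G3", "G4", "G5", "G5", "G5"],
    "Cost": [1574, 1322, 1188, 942, 838, 5, 4, 3],
    "Name1": ["N1A", "N1B", "N1C", "N1D", "N1E", "N1F", "N1F", "N1G"],
    "Name2": ["N2A", "N2B", "N2C", "N2D", "N2E", "N2F", "N2G", "N2H"],
})

# transform('sum') returns a Series aligned with the original rows,
# so every row receives the total Cost of its own group
df["Sum"] = df.groupby("Group")["Cost"].transform("sum")

# Moving the columns into a MultiIndex makes the printed output
# leave repeated Group/Sum labels blank, similar to the expected table
indexed = df.set_index(["Group", "Sum", "Cost"])
print(indexed)

# The blanks are display-only: resetting the index brings back
# the full, repeated values on every row
flat = indexed.reset_index()
print(flat)
```

If the blank cells are only needed for presentation, keeping the flat frame for further computation and building the indexed one just before printing is probably the safer choice.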