
Make a grouped column by sum of another column with pandas

I have this data set:

+===+=======+======+=======+=======+
|   | Group | Cost | Name1 | Name2 |
+===+=======+======+=======+=======+
| 0 | G1    | 1574 | N1A   | N2A   |
+---+-------+------+-------+-------+
| 1 | G2    | 1322 | N1B   | N2B   |
+---+-------+------+-------+-------+
| 2 | G3    | 1188 | N1C   | N2C   |
+---+-------+------+-------+-------+
| 3 | G3    |  942 | N1D   | N2D   |
+---+-------+------+-------+-------+
| 4 | G4    |  838 | N1E   | N2E   |
+---+-------+------+-------+-------+
| 5 | G5    |    5 | N1F   | N2F   |
+---+-------+------+-------+-------+
| 6 | G5    |    4 | N1F   | N2G   |
+---+-------+------+-------+-------+
| 7 | G5    |    3 | N1G   | N2H   |
+---+-------+------+-------+-------+

Now I want to group by "Group" and add a column holding the sum of "Cost" for each group. I don't know how to explain it better, so here is the expected result:

+===+=======+======+======+=======+=======+
|   | Group | Sum  | Cost | Name1 | Name2 |
+===+=======+======+======+=======+=======+
| 0 | G1    | 1574 | 1574 | N1A   | N2A   |
+---+-------+------+------+-------+-------+
| 1 | G2    | 1322 | 1322 | N1B   | N2B   |
+---+-------+------+------+-------+-------+
| 2 | G3    | 2130 | 1188 | N1C   | N2C   |
|   |       |      +------+-------+-------+
|   |       |      |  942 | N1D   | N2D   |
+---+-------+------+------+-------+-------+
| 3 | G4    |  838 |  838 | N1E   | N2E   |
+---+-------+------+------+-------+-------+
| 4 | G5    |   12 |    5 | N1F   | N2F   |
|   |       |      +------+-------+-------+
|   |       |      |    4 | N1F   | N2G   |
|   |       |      +------+-------+-------+
|   |       |      |    3 | N1G   | N2H   |
+---+-------+------+------+-------+-------+

How can I achieve this with pandas? Is that even possible? Sorry, I am new to this.

Use GroupBy.transform with 'sum' to add each group's total to every row of that group. To get the display you want, build a MultiIndex from the Group, Sum and Cost columns with DataFrame.set_index. Note that the repeated values in the outer MultiIndex levels are not actually missing; they are only hidden when the frame is displayed:

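# per-group total of Cost, repeated on every row of the group
df['Sum'] = df.groupby('Group')['Cost'].transform('sum')
# move the three columns into a MultiIndex so repeated labels are hidden on display
df = df.set_index(['Group','Sum','Cost'])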

Or, as a single chained expression that leaves df unchanged:

df1 = (df.assign(Sum = df.groupby('Group')['Cost'].transform('sum'))
         .set_index(['Group','Sum','Cost']))
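For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the data set from the question and applies the chained version above. The variable names group_sum and sums are only illustrative, and the last step is an addition meant to show that the hidden index labels are still stored for every row.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Group": ["G1", "G2", "G3", "G3", "G4", "G5", "G5", "G5"],
    "Cost": [1574, 1322, 1188, 942, 838, 5, 4, 3],
    "Name1": ["N1A", "N1B", "N1C", "N1D", "N1E", "N1F", "N1F", "N1G"],
    "Name2": ["N2A", "N2B", "N2C", "N2D", "N2E", "N2F", "N2G", "N2H"],
})

# transform('sum') returns one value per original row:
# the total Cost of the group that row belongs to.
group_sum = df.groupby("Group")["Cost"].transform("sum")

# Add the totals as a new column, then move Group, Sum and Cost
# into a MultiIndex. df itself is not modified.
df1 = df.assign(Sum=group_sum).set_index(["Group", "Sum", "Cost"])
print(df1)

# Repeated outer labels are hidden only in the printed output;
# the index still holds a Sum value for every row.
sums = df1.index.get_level_values("Sum").tolist()
print(sums)
```

If you later need Group, Sum and Cost back as ordinary columns, df1.reset_index() restores them.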
