Make a grouped column by sum of another column with pandas
I have this dataset:
+===+=======+======+=======+=======+
| | Group | Cost | Name1 | Name2 |
+===+=======+======+=======+=======+
| 0 | G1 | 1574 | N1A | N2A |
+---+-------+------+-------+-------+
| 1 | G2 | 1322 | N1B | N2B |
+---+-------+------+-------+-------+
| 2 | G3 | 1188 | N1C | N2C |
+---+-------+------+-------+-------+
| 3 | G3 | 942 | N1D | N2D |
+---+-------+------+-------+-------+
| 4 | G4 | 838 | N1E | N2E |
+---+-------+------+-------+-------+
| 5 | G5 | 5 | N1F | N2F |
+---+-------+------+-------+-------+
| 6 | G5 | 4 | N1F | N2G |
+---+-------+------+-------+-------+
| 7 | G5 | 3 | N1G | N2H |
+---+-------+------+-------+-------+
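Now I want to group by Group and add a grouped column that contains the sum of the Cost column for each group. I'm not sure how to explain it, so here is the expected result: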
+===+=======+======+======+=======+=======+
| | Group | Sum | Cost | Name1 | Name2 |
+===+=======+======+======+=======+=======+
| 0 | G1 | 1574 | 1574 | N1A | N2A |
+---+-------+------+------+-------+-------+
| 1 | G2 | 1322 | 1322 | N1B | N2B |
+---+-------+------+------+-------+-------+
| 2 | G3 | 2130 | 1188 | N1C | N2C |
| | | +------+-------+-------+
| | | | 942 | N1D | N2D |
+---+-------+------+------+-------+-------+
| 3 | G4 | 838 | 838 | N1E | N2E |
+---+-------+------+------+-------+-------+
| 4 | G5 | 12 | 5 | N1F | N2F |
| | | +------+-------+-------+
| | | | 4 | N1F | N2G |
| | | +------+-------+-------+
| | | | 3 | N1G | N2H |
+---+-------+------+------+-------+-------+
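How can I achieve this with pandas? Is this even possible? Sorry, I'm new to this.
Use GroupBy.transform with sum to add the per-group total to every row, then build a MultiIndex with DataFrame.set_index to get the display you want. Note that the repeated values of the MultiIndex are not actually missing; they are just not displayed: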
df['Sum'] = df.groupby('Group')['Cost'].transform('sum')
df = df.set_index(['Group','Sum','Cost'])
Or:
df1 = (df.assign(Sum=df.groupby('Group')['Cost'].transform('sum'))
         .set_index(['Group', 'Sum', 'Cost']))
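For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand (the DataFrame construction and the name sum_level are my own additions for illustration) and then checks that the values pandas hides in the printed MultiIndex are still stored in the index:

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Group": ["G1", "G2", "G3", "G3", "G4", "G5", "G5", "G5"],
    "Cost": [1574, 1322, 1188, 942, 838, 5, 4, 3],
    "Name1": ["N1A", "N1B", "N1C", "N1D", "N1E", "N1F", "N1F", "N1G"],
    "Name2": ["N2A", "N2B", "N2C", "N2D", "N2E", "N2F", "N2G", "N2H"],
})

# transform('sum') returns one value per original row (the total of that
# row's group), so the result lines up with df and can be assigned directly.
df["Sum"] = df.groupby("Group")["Cost"].transform("sum")

# Moving the columns into a MultiIndex makes pandas print each repeated
# Group/Sum value only once per block of consecutive rows.
df1 = df.set_index(["Group", "Sum", "Cost"])
print(df1)

# The repeated values are only hidden in the printout; every row still
# carries its full index entry.
sum_level = df1.index.get_level_values("Sum").tolist()
print(sum_level)  # [1574, 1322, 2130, 2130, 838, 12, 12, 12]
```

If you only need the totals as an ordinary column and do not care about the grouped display, you can stop after the transform line and skip set_index.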