Performing calculations on subsets of a dataframe based on a column value

Question

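I have a pandas DataFrame with one categorical column followed by several numeric columns. I need to compute various values, but only within the subsets of the DataFrame that share the same category: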

Type | num1 | num2
 a   | 10   | 10 
 a   | 5    | 10
 a   | 1    | 30 
 b   | 5    | 10
...

Here, I want to compute each value's percentage of the total for its Type.

So the output would be:

Type | num1 | num2 | num2_pct
 a   | 10   | 10   | 20
 a   | 5    | 10   | 20
 a   | 1    | 30   | 60
...

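This calculation would be carried out for every value in the Type column.

I tried using df.loc and writing a loop that creates a new DataFrame for each subset and then merges them, but that is not the right way to do it!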

Answer 1

You can run the steps individually, or use pipe to get the result:

#pipe

# g is the DataFrameGroupBy object; each num2 value is divided by its group's num2 total
df["num2_pct"] = (df.groupby("Type")
                    .pipe(lambda g: g.num2.transform(lambda s: s).div(g.num2.transform("sum")).mul(100)))

  Type  num1  num2  num2_pct
0    a    10    10      20.0
1    a     5    10      20.0
2    a     1    30      60.0
3    b     5    10     100.0

#individually, which in my opinion is cleaner:

grouping = df.groupby("Type")

df["num2_pct"] = df.num2 * 100 / grouping.num2.transform("sum")
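
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same groupby/transform idea. The DataFrame is rebuilt from only the four rows shown in the question, and the names num_cols, totals, pct and result are my own, purely for illustration. It also assumes you might want the percentage for every numeric column at once, which the question hints at but does not spell out.

```python
import pandas as pd

# Sample data: only the four rows visible in the question.
df = pd.DataFrame(
    {
        "Type": ["a", "a", "a", "b"],
        "num1": [10, 5, 1, 5],
        "num2": [10, 10, 30, 10],
    }
)

# Hypothetical list of the numeric columns to convert to percentages.
num_cols = ["num1", "num2"]

# transform("sum") returns a frame with the same index as df, where each
# row holds the total of that row's Type group.
totals = df.groupby("Type")[num_cols].transform("sum")

# Element-wise share of the group total, as a percentage.
pct = df[num_cols] * 100 / totals

# Attach the new columns as num1_pct and num2_pct.
result = df.join(pct.add_suffix("_pct"))

print(result)
```

Because transform keeps the original index, no loop and no merge is needed: the division lines up row by row. With this sample data, num2_pct for the Type a rows comes out as 20.0, 20.0 and 60.0, matching the output requested in the question.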

Solution 1
1 Accepted 2020-08-05 12:43:23
