Pandas Groupby columns and get a frequency of 0
I have a dataframe that I want to group by Col1, Col2, and Col3 to get the frequency of 0 in the Value column: df =
Col1 Col2 Col3 Value
Val1 Val2 A 0
Val1 Val2 A 1
Val1 Val2 A 2
Val1 Val2 A 0
Val1 Val2 A 1
Val1 Val2 B 0
Val1 Val2 B 0
Val1 Val2 B 0
Val1 Val2 B 0
Val1 Val2 B 1
...
How can I apply groupby to get
Col1 Col2 Col3 Percentage_of_0
Val1 Val2 A 0.2
Val1 Val2 B 0.8
...
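Thanks!
A simple lambda function does this for you. Build a list of the values in the group where Value==0, then divide the len of that list by the len of the group. That gives you the fraction of zeros.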
import pandas as pd
df = pd.DataFrame({"Col1":["Val1","Val1","Val1","Val1","Val1","Val1","Val1","Val1","Val1","Val1"],"Col2":["Val2","Val2","Val2","Val2","Val2","Val2","Val2","Val2","Val2","Val2"],"Col3":["A","A","A","A","A","B","B","B","B","B"],"Value":[0,1,2,0,1,0,0,0,0,1]})
# Fraction of zeros per group: number of zeros divided by group size
df.groupby(["Col1","Col2","Col3"]).\
    agg({"Value":lambda x: len([v for v in x if v==0])/len(x)})
Output:
                Value
Col1 Col2 Col3
Val1 Val2 A       0.4
          B       0.8
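The same result can be produced without building a Python list inside the lambda. The following is a minimal sketch, assuming pandas 0.25 or later for named aggregation; the variable name `result` is illustrative, and `Percentage_of_0` is taken from the question's desired output.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

# Named aggregation: the keyword becomes the output column name.
# (s == 0) is a boolean Series; its mean is the share of True values.
result = df.groupby(["Col1", "Col2", "Col3"], as_index=False).agg(
    Percentage_of_0=("Value", lambda s: (s == 0).mean())
)
print(result)
```

Passing as_index=False keeps Col1, Col2, and Col3 as ordinary columns, which matches the layout asked for in the question.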
Use groupby on the dataframe, then apply the size() method to the resulting grouped object. For example, suppose you create a dataframe named df containing these values
df = pd.DataFrame({'Col1': ['Val1','Val1','Val1','Val1','Val1','Val1','Val1','Val1'],
'Col2': ['Val2','Val2','Val2','Val2','Val2','Val2','Val2','Val2'],
'Col3': ['A','A','A','A','B','B','B','B'],
'Value':[0,1,2,0,0,0,0,1]})
You can then find the frequency count of each individual value using
df.groupby(['Col1','Col2','Col3','Value']).size()
Col1  Col2  Col3  Value
Val1  Val2  A     0        2
                  1        1
                  2        1
            B     0        3
                  1        1
dtype: int64
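The size() call above returns raw counts, not the fraction the question asks for. One possible way to get from the counts to the share of zeros is sketched below, using the same 8-row dataframe as this answer; the names `counts`, `table`, and `zero_fraction` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 8,
    "Col2": ["Val2"] * 8,
    "Col3": ["A"] * 4 + ["B"] * 4,
    "Value": [0, 1, 2, 0, 0, 0, 0, 1],
})

# Count of each Value within each (Col1, Col2, Col3) group.
counts = df.groupby(["Col1", "Col2", "Col3", "Value"]).size()

# Pivot the Value level into columns; combinations that never occur get 0.
table = counts.unstack("Value", fill_value=0)

# Share of zeros = zero count divided by the total rows in the group.
zero_fraction = table[0] / table.sum(axis=1)
print(zero_fraction)
```

Note that table[0] assumes the value 0 occurs in at least one group; otherwise that column does not exist and the lookup raises a KeyError.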
Here is another way that does not use a lambda, which seems easier to understand to me:
df['is_zero'] = df['Value'] == 0
df.groupby(['Col1', 'Col2', 'Col3'])['is_zero'].mean()
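This works because the mean of a boolean Series is the fraction of its True values. A small sketch of the same idea that does not add the helper column to df is shown here, using the 10-row data from the question; `zero_share` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Val1"] * 10,
    "Col2": ["Val2"] * 10,
    "Col3": ["A"] * 5 + ["B"] * 5,
    "Value": [0, 1, 2, 0, 1, 0, 0, 0, 0, 1],
})

# Group the boolean mask directly by the key columns instead of
# storing it on the dataframe first.
zero_share = (
    df["Value"].eq(0)
    .groupby([df["Col1"], df["Col2"], df["Col3"]])
    .mean()
    .rename("Percentage_of_0")
    .reset_index()
)
print(zero_share)
```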
Create a boolean column that marks where Value equals 0, then group by the Col columns and take the mean
(
df.assign(Percentage_Of_0=lambda x: x.Value.eq(0))
.groupby(["Col1", "Col2", "Col3"], as_index=False)
.Percentage_Of_0.mean()
)
Col1 Col2 Col3 Percentage_Of_0
0 Val1 Val2 A 0.4
1 Val1 Val2 B 0.8