Pandas: group by two columns and count the second column's values per group

Question

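I have a dataset of domains. Can someone tell me how to use Pandas to filter for the domains that have more than one extension?

I grouped the data with this code, but got the following result: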

dfActive.groupby(['domain','ext'])['ext'].nunique()

Result:

domain         com     1
sample         com     1
mashhadmap     com     1
               net     1

Expected result:

mashhadmap     2

Answer 1

IIUC, if you need the sum for each value of the first index level (domain), use:

dfActive.groupby(['domain','ext'])['ext'].nunique().groupby(level=0).sum()
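
As a minimal sketch of what this returns, the snippet below rebuilds a small frame resembling the one in the question. The sample rows (including the repeated mashhadmap/com row) are assumptions made for illustration, not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data modelled on the output shown in the question.
dfActive = pd.DataFrame({
    "domain": ["domain", "sample", "mashhadmap", "mashhadmap", "mashhadmap"],
    "ext": ["com", "com", "com", "net", "com"],
})

# Inner groupby: one row per distinct (domain, ext) pair, each with value 1.
per_pair = dfActive.groupby(["domain", "ext"])["ext"].nunique()

# Outer groupby: add up the pairs per domain (index level 0),
# which gives the number of distinct extensions for each domain.
per_domain = per_pair.groupby(level=0).sum()
print(per_domain)
```

Note that this keeps every domain, including those with a single extension, so it does not yet match the expected result on its own.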

If you only want the values whose first level (domain) is duplicated, filter them first:

s = dfActive.groupby(['domain','ext'])['ext'].nunique()
s = s[s.index.get_level_values(0).duplicated(keep=False)]

# then aggregate with sum, if needed
out = s.groupby(level=0).sum()
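
The filter can be tried end to end on sample data. This is only a sketch: the rows are invented for illustration, and the last two statements show an alternative that is not part of the answer, namely counting distinct extensions with a single groupby on domain and keeping the counts above 1.

```python
import pandas as pd

# Hypothetical sample data modelled on the output shown in the question.
dfActive = pd.DataFrame({
    "domain": ["domain", "sample", "mashhadmap", "mashhadmap"],
    "ext": ["com", "com", "com", "net"],
})

s = dfActive.groupby(["domain", "ext"])["ext"].nunique()

# True for every (domain, ext) row whose domain occurs more than once
# in index level 0, i.e. the domain has more than one extension.
mask = s.index.get_level_values(0).duplicated(keep=False)
out = s[mask].groupby(level=0).sum()
print(out)

# Alternative (an assumption, not from the answer): count distinct
# extensions per domain directly, then keep the domains with more than one.
counts = dfActive.groupby("domain")["ext"].nunique()
multi = counts[counts > 1]
print(multi)
```

On this sample both `out` and `multi` contain only mashhadmap, which has two extensions.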

Solution 1
1 Accepted 2022-07-06 06:52:41
