Pandas group by two columns and count the second column value by each group

Question

I have a dataset of domains. Could someone tell me how I can filter for domains that have more than one extension using Pandas?

I grouped it with the code below, but this is the result I got:

dfActive.groupby(['domain','ext'])['ext'].nunique()

Result:

domain         com     1
sample         com     1
mashhadmap     com     1
               net     1

Expected Result:

mashhadmap     2

Answer 1

If I understand correctly, and you need the count per domain (the first index level), aggregate the grouped result with sum:

dfActive.groupby(['domain','ext'])['ext'].nunique().groupby(level=0).sum()
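To make the one-liner above easier to try out, here is a minimal sketch on a small hand-made frame. The sample rows are an assumption, modelled on the output shown in the question; only the `dfActive`, `domain` and `ext` names come from the original post.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the question's printed result
dfActive = pd.DataFrame({
    "domain": ["domain", "sample", "mashhadmap", "mashhadmap"],
    "ext": ["com", "com", "com", "net"],
})

# One row per (domain, ext) pair; each pair has exactly one unique ext
per_pair = dfActive.groupby(["domain", "ext"])["ext"].nunique()

# Summing over the first index level gives the number of extensions per domain
per_domain = per_pair.groupby(level=0).sum()
print(per_domain)
```

With this data, `mashhadmap` ends up with 2 and the other domains with 1.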

If you need to keep only the domains that appear more than once in the first index level:

s = dfActive.groupby(['domain','ext'])['ext'].nunique()
s = s[s.index.get_level_values(0).duplicated(keep=False)]

# then, if needed, aggregate with sum to get one count per domain
out = s.groupby(level=0).sum()
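The filter above can be checked on a small frame, and compared with a shorter variant that groups by `domain` only. This is a sketch under assumptions: the sample rows are made up to match the question's output, and the `ext_counts` / `multi_ext` variant is an alternative that does not appear in the answer itself.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the question's printed result
dfActive = pd.DataFrame({
    "domain": ["domain", "sample", "mashhadmap", "mashhadmap"],
    "ext": ["com", "com", "com", "net"],
})

# The answer's approach: keep pairs whose domain occurs more than once
s = dfActive.groupby(["domain", "ext"])["ext"].nunique()
s = s[s.index.get_level_values(0).duplicated(keep=False)]
out = s.groupby(level=0).sum()

# Alternative (not from the answer): count unique extensions per domain directly
ext_counts = dfActive.groupby("domain")["ext"].nunique()
multi_ext = ext_counts[ext_counts > 1]

print(out)
print(multi_ext)
```

On this data both `out` and `multi_ext` contain a single entry, `mashhadmap` with 2, which matches the expected result in the question.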

1 answer

solution1
1 ACCEPTED 2022-07-06 06:52:41
