
Pandas group by and fraction where two columns equal

I want to group by one column and, within each group, get the fraction of rows where two other columns are equal.

In the example below, I want to group by col1 and check whether col2 == col3.

Input:

col1 | col2 | col3
A    |  c   |  c
A    |  d   |  g
B    |  c   |  c
B    |  d   |  d

Desired output:

A | 0.5 # it's 50% because c == c, d != g
B | 1.0 # it's 100% because c == c, d == d

Is there a way to do this in pandas in one command?

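Use groupby + mean: compare the two columns element-wise, then take the mean of the boolean result within each col1 group.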

df['col2'].eq(df['col3']).groupby(df['col1']).mean()

col1
A    0.5
B    1.0
dtype: float64
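
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and splits the one-liner into two steps. The variable names is_equal and fraction are only illustrative. It works because True counts as 1 and False as 0, so the mean of the boolean Series within each group is the fraction of matching rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": ["A", "A", "B", "B"],
    "col2": ["c", "d", "c", "d"],
    "col3": ["c", "g", "c", "d"],
})

# Element-wise comparison: True where col2 equals col3 on that row.
is_equal = df["col2"].eq(df["col3"])

# Group the boolean Series by col1; the mean of True/False values
# is the fraction of True values in each group.
fraction = is_equal.groupby(df["col1"]).mean()
print(fraction)
```

Grouping by the Series df['col1'] rather than by a column name is what lets the comparison be computed once for the whole frame before grouping.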

You can group by col1, then within each group count the rows where col2 equals col3 and divide by the group length.

out = df.groupby('col1').apply(lambda g: g['col2'].eq(g['col3']).sum()/len(g))
print(out)

col1
A    0.5
B    1.0
dtype: float64
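
As a rough check, the sketch below runs the apply-based version next to the vectorized groupby + mean version on the question's sample data and compares the results. Selecting col2 and col3 before calling apply is my own addition, not part of the answer above: as far as I know, recent pandas versions warn when apply operates on the grouping column, and the selection sidesteps that. The names by_apply and by_mean are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": ["A", "A", "B", "B"],
    "col2": ["c", "d", "c", "d"],
    "col3": ["c", "g", "c", "d"],
})

# apply-based version: per group, count matching rows and divide by group size.
# Selecting the two compared columns keeps col1 out of each group frame.
by_apply = (
    df.groupby("col1")[["col2", "col3"]]
      .apply(lambda g: g["col2"].eq(g["col3"]).sum() / len(g))
)

# Vectorized version: compare once, then average the booleans per group.
by_mean = df["col2"].eq(df["col3"]).groupby(df["col1"]).mean()

print(by_apply)
print(by_mean)
print((by_apply == by_mean).all())
```

Both give the same fractions here. The vectorized form avoids calling a Python function once per group, which usually matters on frames with many groups.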

