Assume DF 1:
   A  B  C
0  1  1  1
1  1  1  2
2  2  1  1
3  1  9  0
4  9  9  9
And DF 2:
   A  B  C
0  6  1  1
1  1  1  2
2  2  1  1
3  1  9  0
4  1  9  6
I would like to add a column to DF 1 that counts, for each row, how many rows in DF 2 match it on a subset of columns.
For example, matching on columns A and B:
Result:
   A  B  C  Dupe
0  1  1  1     1
1  1  1  2     1
2  2  1  1     1
3  1  9  0     2
4  9  9  9     0
Sounds like you should group df2 by the key columns, count the rows in each group, then left-merge those counts onto df1:
counts = df2.groupby(['A', 'B']).size().to_frame('DUP').reset_index()
df = df1.merge(counts, how='left').fillna(0)
   A  B  C  DUP
0  1  1  1  1.0
1  1  1  2  1.0
2  2  1  1  1.0
3  1  9  0  2.0
4  9  9  9  0.0
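The counts come back as floats because the left merge leaves NaN in rows that have no match in df2, and NaN forces the column to a float dtype. If integer counts are wanted, as in the question's expected result, something like the sketch below may help. It rebuilds both frames from the data in the question so it can be run on its own. The names `keys`, `counts` and `result` are illustrative only, and the count column is named `Dupe` to match the question rather than `DUP`.

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"A": [1, 1, 2, 1, 9],
                    "B": [1, 1, 1, 9, 9],
                    "C": [1, 2, 1, 0, 9]})
df2 = pd.DataFrame({"A": [6, 1, 2, 1, 1],
                    "B": [1, 1, 1, 9, 9],
                    "C": [1, 2, 1, 0, 6]})

# Columns that define a match (assumed here to be A and B).
keys = ["A", "B"]

# Count how many rows of df2 share each (A, B) combination.
counts = df2.groupby(keys).size().reset_index(name="Dupe")

# The left merge keeps every row of df1; unmatched rows get NaN,
# which is replaced by 0 before casting back to int.
result = df1.merge(counts, on=keys, how="left")
result["Dupe"] = result["Dupe"].fillna(0).astype(int)

print(result)
```

Passing `on=keys` explicitly is a design choice: without it, `merge` joins on every column name the two frames share, which happens to be A and B here but would change if the counts frame gained another overlapping column.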