
How to remove rows with duplicates in pandas dataframe?

I have a dataframe that contains duplicate values across two columns (A and B):

A B
1 2
2 3
4 5
7 6
5 8

I want to remove every row that contains a value already seen in an earlier row (in either column), so that only rows with unique values remain:

A B
1 2
4 5
7 6

This command does not provide what I want:

df.drop_duplicates(subset=['A','B'], keep='first')

Any idea how to do this?
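For context, a minimal sketch of why that call leaves the frame unchanged: with subset=['A','B'], drop_duplicates compares each row's (A, B) pair as a whole, and no pair repeats in the sample data. The frame below is rebuilt from the table in the question; the variable names are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"A": [1, 2, 4, 7, 5], "B": [2, 3, 5, 6, 8]})

# drop_duplicates treats the (A, B) pair of each row as one unit.
# Every pair here is distinct, so no row is dropped.
result = df.drop_duplicates(subset=["A", "B"], keep="first")
print(result.equals(df))
```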

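You can use stack with unstack. Stacking puts both columns into a single Series, so drop_duplicates compares values across A and B together:

print(df.stack().drop_duplicates().unstack().dropna().astype(int))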
   A  B
0  1  2
2  4  5
3  7  6
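The one-liner above can be broken into steps to make each stage visible. This is a sketch that assumes integer data with no missing values to begin with; the intermediate variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 4, 7, 5], "B": [2, 3, 5, 6, 8]})

# One long Series indexed by (row label, column name), in row-major order
stacked = df.stack()
# Keep only the first occurrence of each value across both columns
deduped = stacked.drop_duplicates()
# Back to A/B columns; cells that were dropped come back as NaN
wide = deduped.unstack()
# Discard rows that lost a cell, then undo the float upcast caused by NaN
result_stack = wide.dropna().astype(int)
print(result_stack)
```

Note that dropna would also discard rows that already contained NaN in the original data, and astype(int) only makes sense when every column is integer-valued.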

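Solution with boolean indexing (the axis is passed by keyword, since recent pandas versions no longer accept it positionally in any):

print(df[~df.stack().duplicated().unstack().any(axis=1)])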
   A  B
0  1  2
2  4  5
3  7  6
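The boolean-indexing approach can also be wrapped in a small function. This is a sketch under the assumption that the frame has no missing values; the helper name drop_rows_with_repeated_values is hypothetical, not part of pandas.

```python
import pandas as pd


def drop_rows_with_repeated_values(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop rows holding a value already seen earlier."""
    # True for each cell whose value appeared earlier in row-major order
    repeated = frame.stack().duplicated().unstack()
    # Keep only the rows in which no cell is a repeat
    return frame[~repeated.any(axis=1)]


df = pd.DataFrame({"A": [1, 2, 4, 7, 5], "B": [2, 3, 5, 6, 8]})
result_mask = drop_rows_with_repeated_values(df)
print(result_mask)
```

Because this version filters the original frame with a mask instead of rebuilding it, the column dtypes are left as they were and no astype call is needed.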
