
Trying to make a dataframe with the dropped duplicates from pandas drop_duplicates on a certain field

I have a dataframe in which one field is supposed to be unique. In the data I am given, that field is not unique, so I have been using drop_duplicates to get rid of the repeated rows. However, I would like to see which rows I am dropping, for QC. I've been reading threads on this, but the ones I've found either look at entire duplicate rows (not just one duplicated field) or compare dataframes that have no duplicates within themselves. How can I get a dataframe of the rows that are removed by my code below? Thank you.

   df = df.drop_duplicates(subset='_nefin_tree_obsID', keep=False)

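Refer to the documentation for DataFrame.duplicated. It returns a boolean mask marking the duplicated rows, which you can use to select them. Pass the same keep argument that you give drop_duplicates so that the mask matches the rows being removed.

This should help:

df[df.duplicated(subset='_nefin_tree_obsID', keep=False)]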
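
To make the idea concrete, here is a minimal sketch of splitting a dataframe into the rows that would be removed and the rows that would be kept. The sample data and the height_m column are made up for illustration; only the _nefin_tree_obsID column name comes from the question. The mask has to be computed before df is overwritten, otherwise the dropped rows are already gone.

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df = pd.DataFrame({
    "_nefin_tree_obsID": [101, 102, 102, 103, 104, 104, 104],
    "height_m": [12.0, 8.5, 8.7, 15.2, 6.1, 6.3, 6.0],
})

# keep=False marks every row whose ID occurs more than once,
# which mirrors drop_duplicates(..., keep=False).
dup_mask = df.duplicated(subset="_nefin_tree_obsID", keep=False)

removed = df[dup_mask]   # rows that drop_duplicates would discard (for QC)
kept = df[~dup_mask]     # rows that drop_duplicates would return

print(removed.sort_values("_nefin_tree_obsID"))
```

If the real code used keep='first' or keep='last' instead, passing that same value to duplicated should give the matching set of removed rows.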
