Trying to make a dataframe with the dropped duplicates from pandas drop_duplicates on a certain field

Question

I have a dataframe with a field that is supposed to be unique. In the data I am given, the field is not unique, so I have been using drop_duplicates to get rid of the duplicated rows. However, I would like to see which rows I am dropping, for QC. I've been reading threads on this, but the ones I've found either look at entire duplicate rows (not just one duplicated field) or compare dataframes that have no duplicates within themselves. How can I get a dataframe of the rows that are removed by my code below? Thank you.

   df = df.drop_duplicates(subset='_nefin_tree_obsID', keep=False)

Answer 1

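See the pandas documentation for DataFrame.duplicated. It returns a boolean mask marking the duplicated rows, and it accepts the same subset and keep arguments as drop_duplicates.

Because your call uses keep=False, pass keep=False here as well so that every row sharing an ID is flagged, not just the later copies. Run this before you overwrite df:

   dropped = df[df.duplicated(subset='_nefin_tree_obsID', keep=False)]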
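
As a minimal sketch of how the mask can be used for QC, the example below splits a small made-up frame into the rows that would be removed and the rows that would be kept. Only the column name _nefin_tree_obsID comes from the question; the sample values, the height column, and the names dup_mask, removed, and kept are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame(
    {
        "_nefin_tree_obsID": [1, 2, 2, 3, 4, 4, 4],
        "height": [10.0, 11.5, 11.7, 9.2, 14.1, 14.0, 13.9],
    }
)

# keep=False marks every row whose ID occurs more than once,
# which mirrors drop_duplicates(..., keep=False).
dup_mask = df.duplicated(subset="_nefin_tree_obsID", keep=False)

removed = df[dup_mask]   # rows that drop_duplicates would discard
kept = df[~dup_mask]     # rows that drop_duplicates would keep

print(removed)
```

Computing the mask once and indexing with it and its negation means the two frames always add up to the original, so nothing is lost between the QC table and the cleaned data. If you use a different keep value in drop_duplicates (for example keep='first'), pass the same value to duplicated so the two stay consistent.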

solution1
0 ACCEPTED 2022-08-01 18:03:02