
Drop rows of a pandas DataFrame with unique elements in a given column (by unique I mean values that appear only once)

Let's say I have the following DataFrame and I want to drop the rows containing 10 and 100, i.e. the elements that appear only once in col1.

[DataFrame shown as an image in the original post]

I can do the following:

a = df.groupby('col1').size()     # count occurrences of each col1 value
b = list(a[a == 1].index)         # col1 values that occur exactly once

and then loop over b, dropping the matching rows one value at a time:

d_ind = df[df['col1'] == b[0]].index   # index labels of rows to drop
df.drop(d_ind, axis=0, inplace=True)

Is there any faster, more efficient way?

You can use the duplicated method on col1 with the keep=False parameter, which marks every element that has duplicates (rather than keeping the first or last occurrence unmarked). It returns a boolean Series that you can use to subset/filter the rows:

df[df.col1.duplicated(keep=False)]

#   col1  col2  months
#0     1     3       6
#1     1     4       6
#4     4    20       6
#5     4    11       7
#6     4    12       7
