Display rows where any value in a particular column occurs more than once

I want to display all rows where the value in the "website" column occurs more than once. For example, if a certain website "xyz.com" occurs more than once, I want to display all of those rows. I am using the code below:

df[df.website.isin(df.groupby('website').website.count() > 1)]

The code above returns zero rows, but I can see that many websites occur more than once by running the code below:

df.website.value_counts()

How should I modify my first line of code to display all such rows?

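Use duplicated with subset='website' and keep=False, which marks every occurrence of a repeated value as a duplicate (not just the second and later ones):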

df[df.duplicated(subset='website', keep=False)]

Sample Input:

  col1  website
0    A  abc.com
1    B  abc.com
2    C  abc.com
3    D  abc.net
4    E  xyz.com
5    F  foo.bar
6    G  xyz.com
7    H  foo.baz 

Sample Output:

  col1  website
0    A  abc.com
1    B  abc.com
2    C  abc.com
4    E  xyz.com
6    G  xyz.com
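
For anyone who wants to try this, here is a minimal sketch that rebuilds the sample input above and runs the duplicated approach on it. It also shows what seems to be going wrong in the question's code: the groupby comparison produces a boolean Series indexed by website, and isin compares against that Series' values (True/False) rather than its index, so no website string ever matches. The variable names (dupes, flags, repeated, via_isin, via_transform) are just illustrative, and the two alternatives are only one way of fixing the original line, not the only one.

```python
import pandas as pd

# Rebuild the sample input from the answer
df = pd.DataFrame({
    "col1": list("ABCDEFGH"),
    "website": ["abc.com", "abc.com", "abc.com", "abc.net",
                "xyz.com", "foo.bar", "xyz.com", "foo.baz"],
})

# The answer's approach: keep=False flags every row of a repeated website
dupes = df[df.duplicated(subset="website", keep=False)]
print(dupes)

# What the question's code passes to isin: a boolean Series whose
# index holds the websites and whose values are only True/False
flags = df.groupby("website")["website"].count() > 1

# Fix in the style of the original line: pass the index labels
# (the website names) whose flag is True, not the flags themselves
repeated = flags[flags].index
via_isin = df[df["website"].isin(repeated)]

# Alternative: transform broadcasts each group's count back to every row
via_transform = df[df.groupby("website")["website"].transform("count") > 1]
```

All three selections should pick the same rows (index 0, 1, 2, 4, 6). duplicated is the shortest to write; the transform version is handy if the threshold is something other than "more than once", since the 1 can be swapped for any count.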
