Display rows where any value in a particular column occurs more than once

Question

I want to display all rows whose value in the "website" column occurs more than once. For example, if the website "xyz.com" occurs more than once, I want to display every row that contains it. I am using the code below:

df[df.website.isin(df.groupby('website').website.count() > 1)]

The code above returns zero rows, but I can see that many websites occur more than once by running the code below:

df.website.value_counts()

How should I modify my first line of code to display all such rows?

Answer 1

Use duplicated with subset='website' and keep=False, which marks every occurrence of a repeated value as True rather than only the later ones:

df[df.duplicated(subset='website', keep=False)]

Sample Input:

  col1  website
0    A  abc.com
1    B  abc.com
2    C  abc.com
3    D  abc.net
4    E  xyz.com
5    F  foo.bar
6    G  xyz.com
7    H  foo.baz

Sample Output:

  col1  website
0    A  abc.com
1    B  abc.com
2    C  abc.com
4    E  xyz.com
6    G  xyz.com
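
As a rough, self-contained sketch of the above (the DataFrame is rebuilt by hand from the sample input, and the variable names attempt, dupes, via_groupby and via_isin are only illustrative), the snippet below also shows why the code in the question returns nothing: isin() is handed a boolean Series, so the website strings are compared against True/False and never match. Two repairs that stay closer to the original attempt are included for comparison.

```python
import pandas as pd

# Sample input from the answer, rebuilt by hand.
df = pd.DataFrame({
    "col1": list("ABCDEFGH"),
    "website": ["abc.com", "abc.com", "abc.com", "abc.net",
                "xyz.com", "foo.bar", "xyz.com", "foo.baz"],
})

# The attempt from the question: the argument to isin() is a Series of
# booleans, so no website string is ever found in it.
attempt = df[df.website.isin(df.groupby('website').website.count() > 1)]

# The answer's approach: keep=False flags every row of a repeated value.
dupes = df[df.duplicated(subset='website', keep=False)]

# Alternative 1: broadcast each group's size back onto its rows.
counts = df.groupby('website')['website'].transform('size')
via_groupby = df[counts > 1]

# Alternative 2: pass isin() the repeated website names, not the booleans.
vc = df.website.value_counts()
via_isin = df[df.website.isin(vc[vc > 1].index)]

print(dupes)
```

All three working variants should select the same rows here; duplicated is the shortest, while the transform version is easy to adapt to a different threshold (for example counts > 2).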

1 answer

solution1
6 ACCEPTED 2016-07-06 18:56:36
