Pandas Counting Unique Rows

Question

I have a pandas data frame with two columns, ColA and ColB, in which some rows appear more than once.

I want output that works like Counter: I need to know how many times each row appears (with all of the columns being the same).

In this case the proper output would be:

ColA ColB Count
1    1    3
1    2    2
2    1    1
3    2    1

I have tried something of the sort:

df.groupby(['ColA','ColB']).ColA.count()

but this gives me some ugly output that I am having trouble formatting.

Answer 1

You can use size with reset_index:

print(df.groupby(['ColA','ColB']).size().reset_index(name='Count'))
   ColA  ColB  Count
0     1     1      3
1     1     2      2
2     2     1      1
3     3     2      1
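
For reference, here is a self-contained Python 3 sketch of the same approach. The sample frame is an assumption: it is built to reproduce the counts shown in the question, not copied from it.

```python
import pandas as pd

# Assumed sample data, chosen to reproduce the counts in the question.
df = pd.DataFrame({'ColA': [1, 1, 1, 1, 1, 2, 3],
                   'ColB': [1, 1, 1, 2, 2, 1, 2]})

# size() counts the rows in each (ColA, ColB) group; reset_index() turns the
# group keys back into ordinary columns and names the count column 'Count'.
counts = df.groupby(['ColA', 'ColB']).size().reset_index(name='Count')
print(counts)
```

size() returns the number of rows per group, whereas count() returns the number of non-null values in the selected column, so the two can differ when the data contains missing values.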

Answer 2

I only needed the number of unique rows, so I used DataFrame.drop_duplicates as an alternative:

len(df[['ColA', 'ColB']].drop_duplicates())

On my data it was twice as fast as len(df.groupby(['ColA', 'ColB'])).
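
A minimal sketch of counting the distinct rows both ways, on assumed sample data. The ngroups attribute is shown as an alternative to calling len() on the groupby object; it is not part of the original answer, and relative timings will depend on your data.

```python
import pandas as pd

# Assumed sample data with four distinct (ColA, ColB) combinations.
df = pd.DataFrame({'ColA': [1, 1, 1, 1, 1, 2, 3],
                   'ColB': [1, 1, 1, 2, 2, 1, 2]})

# Drop repeated rows, then count what is left.
n_unique = len(df[['ColA', 'ColB']].drop_duplicates())

# Number of groups formed by the two key columns.
n_groups = df.groupby(['ColA', 'ColB']).ngroups

print(n_unique, n_groups)  # prints: 4 4
```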

Answer 3

Since pandas 1.1.0, the method pandas.DataFrame.value_counts has been available, which does exactly what you need. It creates a Series with the unique rows as a MultiIndex and the counts as values:

import pandas as pd

df = pd.DataFrame({'ColA': [1, 1, 1, 1, 1, 2, 3], 'ColB': [1, 1, 1, 2, 2, 1, 2]})
pd.options.display.multi_sparse = False  # option to print as requested

print(df.value_counts())                 # requires pandas >= 1.1.0

Output, where ColA and ColB are the multi-index and the third column contains the counts:

ColA  ColB
1     1       3
1     2       2
3     2       1
2     1       1
dtype: int64
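
If you want the flat ColA / ColB / Count table from the question rather than a Series, the result can be converted back to a DataFrame. This is a sketch assuming pandas >= 1.1.0; the sort_index() step is an addition to make the row order predictable, because value_counts orders by count and rows with equal counts may come out in either order.

```python
import pandas as pd

df = pd.DataFrame({'ColA': [1, 1, 1, 1, 1, 2, 3],
                   'ColB': [1, 1, 1, 2, 2, 1, 2]})

counts = (
    df.value_counts()              # Series indexed by (ColA, ColB)
      .sort_index()                # order rows by ColA, then ColB
      .reset_index(name='Count')   # index levels become columns again
)
print(counts)
```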

3 answers

solution1
20 ACCEPTED 2016-03-15 18:16:31

solution2
14 2019-09-24 19:17:47

solution3
2 2020-12-31 15:51:17
