Count the number of unique rows in a pandas DataFrame

I need to count the number of unique rows in a pandas DataFrame. I tried the solution from "pandas - number of unique rows occurrences in dataframe", but it raises an error.

This is the code I tried:

import pandas as pd

df = {'x1': ['A','B','A','A','B','A','A','A'], 'x2': [1,3,2,2,3,1,2,3]}
df = pd.DataFrame(df)

print df.groupby(['x1','x2'], as_index=False).count()

This is the error:

Traceback (most recent call last):
  File "/home/user/workspace/project/test.py", line 9, in <module>
    print df.groupby(['x1','x2'], as_index=False).count()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 4372, in count
    return self._wrap_agged_blocks(data.items, list(blk))
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 4274, in _wrap_agged_blocks
    index = np.arange(blocks[0].values.shape[1])
IndexError: list index out of range

What am I doing wrong?

Use size() instead (you can append .reset_index() at the end to turn the group keys back into columns):

df.groupby(['x1','x2'], as_index=False).size()
Out[1262]: 
x1  x2
A   1     2
    2     3
    3     1
B   3     2
dtype: int64
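
The .reset_index() variant mentioned above could look like the following minimal sketch. It assumes the same df as in the question; the column name 'count' is an illustrative choice, not part of the original answer. The sketch groups with the default as_index=True because, as far as I know, recent pandas releases make size() return a DataFrame with a 'size' column when as_index=False is passed, so the Series output shown above may differ by version.

```python
import pandas as pd

df = pd.DataFrame({'x1': ['A', 'B', 'A', 'A', 'B', 'A', 'A', 'A'],
                   'x2': [1, 3, 2, 2, 3, 1, 2, 3]})

# size() counts the rows in each (x1, x2) group and returns a Series
# indexed by the group keys
sizes = df.groupby(['x1', 'x2']).size()

# reset_index(name=...) moves the group keys back into ordinary columns;
# 'count' is an illustrative column name
counts = sizes.reset_index(name='count')
print(counts)
```

Here len(counts) gives the number of unique rows, and the 'count' column gives how often each one occurs.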

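Or fix your code by selecting a column to count. count() tallies the non-null values in the non-key columns, and here every column is a grouping key, so there is nothing left to count: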

df.groupby(['x1','x2'])['x2'].count()
Out[1264]: 
x1  x2
A   1     2
    2     3
    3     1
B   3     2
Name: x2, dtype: int64

If you only want the number of unique groups, you can use ngroups:

df.groupby(['x1','x2']).ngroups
Out[1267]: 4

You could drop duplicates:

import pandas as pd

df = {'x1': ['A','B','A','A','B','A','A','A'], 'x2': [1,3,2,2,3,1,2,3]}
df = pd.DataFrame(df)

print(len(df.drop_duplicates()))

Returns:

4
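
A related sketch, assuming the same df: duplicated() gives the same count without building the deduplicated frame, and the subset parameter of drop_duplicates() limits the comparison to chosen columns. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'x1': ['A', 'B', 'A', 'A', 'B', 'A', 'A', 'A'],
                   'x2': [1, 3, 2, 2, 3, 1, 2, 3]})

# duplicated() marks every repeat of an earlier row as True,
# so the number of False entries is the number of unique rows
n_unique_rows = int((~df.duplicated()).sum())

# subset= restricts the uniqueness check to the listed columns
n_unique_x1 = len(df.drop_duplicates(subset=['x1']))

print(n_unique_rows, n_unique_x1)  # prints: 4 2
```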

To count the number of occurrences of each unique row in the DataFrame, you should now use value_counts instead of count:

df.groupby(['x1','x2'], as_index=False).value_counts()
Out[417]: 
  x1  x2  count
0  A   1      2
1  A   2      3
2  A   3      1
3  B   3      2
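
As a hedged alternative, value_counts() can also be called on the DataFrame itself, without a groupby. This sketch assumes a pandas version that provides DataFrame.value_counts() (1.1 or later, to my knowledge); the names row_counts and table are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x1': ['A', 'B', 'A', 'A', 'B', 'A', 'A', 'A'],
                   'x2': [1, 3, 2, 2, 3, 1, 2, 3]})

# DataFrame.value_counts() counts each distinct row; the result is a
# Series indexed by (x1, x2), sorted by frequency in descending order
row_counts = df.value_counts()

# Its length is the number of unique rows
n_unique_rows = len(row_counts)

# reset_index(name=...) converts it to a flat table with a 'count' column
table = row_counts.reset_index(name='count')

print(n_unique_rows)
print(table)
```

Unlike the groupby result above, the rows come back ordered by count rather than by key.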
