Count unique values in pandas groups

Question

I have a dataframe like this:

import pandas as pd

data = {'id': [1,1,1,2,2,3],
        'value': ['a','a','a','b','b','c'],
        'obj_id': [1,2,3,3,3,4]
}
df = pd.DataFrame(data, columns=['id','value','obj_id'])

I would like to get the number of unique obj_id values for each combination of id and value:

1 a 3
2 b 1
3 c 1

But when I do:

result=df.groupby(['id','value'])['obj_id'].nunique().reset_index(name='obj_counts')

In the result I got, the first two rows have the same id and value but are not grouped together.

How can I fix this? Many thanks!

Answer 1

Your solution works fine for me with the sample data.

As @YOBEN_S mentioned in the comments, a possible cause is trailing whitespace in the values. If so, the solution is to remove it with Series.str.strip before grouping:

data = {'id': [1,1,1,2,2,3],
        'value': ['a ','a','a','b','b','c'],
        'obj_id': [1,2,3,3,3,4]
}
df = pd.DataFrame(data, columns=['id','value','obj_id'])

# Remove leading/trailing whitespace so 'a ' and 'a' fall into the same group
df['value'] = df['value'].str.strip()
df = df.groupby(['id','value'])['obj_id'].nunique().reset_index(name='obj_counts')
print(df)
   id value  obj_counts
0   1     a           3
1   2     b           1
2   3     c           1
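
If you are not sure whether stray whitespace is really the cause, one way to check is to look at the distinct values through repr(), which makes otherwise invisible spaces visible, and to flag the rows that change when stripped. The snippet below is only a sketch on the sample data from this answer; the names distinct_values, has_padding and before are made up for illustration, and your real data may contain a different kind of invisible difference (tabs, non-breaking spaces, mixed dtypes) that str.strip alone would not catch.

```python
import pandas as pd

# Same sample data as above, with one trailing space in the first 'value'.
data = {'id': [1, 1, 1, 2, 2, 3],
        'value': ['a ', 'a', 'a', 'b', 'b', 'c'],
        'obj_id': [1, 2, 3, 3, 3, 4]}
df = pd.DataFrame(data, columns=['id', 'value', 'obj_id'])

# repr() shows the quotes around each string, so 'a ' and 'a' look different.
distinct_values = [repr(v) for v in df['value'].unique()]
print(distinct_values)

# Boolean mask of rows whose 'value' changes when whitespace is stripped.
has_padding = df['value'] != df['value'].str.strip()
print(df[has_padding])

# Grouping without stripping: 'a ' and 'a' end up in separate groups.
before = (df.groupby(['id', 'value'])['obj_id']
            .nunique()
            .reset_index(name='obj_counts'))
print(before)
```

On this sample, before has four rows instead of three, because id 1 is split into one group for 'a' and one for 'a '.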

Solution 1
1 Accepted 2020-02-20 14:24:46