How to count unique records by two columns per group in pandas?

Question

Same as "How to count unique records by two columns in pandas?", only per group. I tried:

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2], 'b': [10, 10, 20, 30, 30], 'c': [5, 7, 7, 11, 17]})
df.groupby('a').groupby(['b', 'c']).ngroups

And it throws AttributeError.
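For context, a minimal sketch of why the chained call fails: `df.groupby('a')` returns a `DataFrameGroupBy` object rather than a DataFrame, and that object has no `groupby` method of its own. The variable names `grouped` and `chained_ok` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2],
                   'b': [10, 10, 20, 30, 30],
                   'c': [5, 7, 7, 11, 17]})

grouped = df.groupby('a')            # a DataFrameGroupBy, not a DataFrame
print(type(grouped).__name__)

try:
    grouped.groupby(['b', 'c'])      # a GroupBy object cannot be grouped again
    chained_ok = True
except AttributeError as exc:
    chained_ok = False
    print('AttributeError:', exc)
```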

Answer 1

You don't need the double groupby. Use drop_duplicates with ['b', 'c'] as the subset to keep only the unique rows, then group by 'a' and use size:

df.drop_duplicates(['b', 'c']).groupby('a').size()

a
1    3
2    2
dtype: int64
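One caveat worth checking on your own data: drop_duplicates(['b', 'c']) removes duplicates across the whole frame, not within each group. If the same (b, c) pair can occur under different values of 'a', including 'a' in the subset keeps the count per group. The sketch below uses a hypothetical frame `df2`, not the data from the question, to show the difference.

```python
import pandas as pd

# Hypothetical data: the pair (10, 5) occurs in both group 1 and group 2.
df2 = pd.DataFrame({'a': [1, 1, 2, 2],
                    'b': [10, 10, 10, 30],
                    'c': [5, 7, 5, 11]})

# Deduplicating on ['b', 'c'] alone drops the (10, 5) row of group 2.
global_dedup = df2.drop_duplicates(['b', 'c']).groupby('a').size()

# Including 'a' in the subset deduplicates within each group instead.
per_group = df2.drop_duplicates(['a', 'b', 'c']).groupby('a').size()

print(global_dedup)
print(per_group)
```

For the frame in the question both variants give the same result, because no (b, c) pair is shared between groups.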

Answer 2

You need to apply a function to the result of the first grouping:

df.groupby('a').apply(lambda x: x.groupby(['b', 'c']).ngroups)
#a
#1    3
#2    2
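If you would rather not call a Python function once per group, the same count can probably be obtained by grouping on all three columns and then counting the resulting combinations per value of 'a'. This is a sketch of an alternative, not part of the original answer; the name `counts` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2],
                   'b': [10, 10, 20, 30, 30],
                   'c': [5, 7, 7, 11, 17]})

# One entry per distinct (a, b, c) combination ...
combos = df.groupby(['a', 'b', 'c']).size()

# ... then count how many combinations fall under each value of 'a'.
counts = combos.groupby(level='a').size()
print(counts)
```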

2 answers

Answer 1
6 2018-08-05 07:20:27

Answer 2
3 accepted 2018-08-05 07:16:42
