Count of unique values per group as new column with pandas

I would like to count the unique observations per group in a pandas dataframe and create a new column that holds the unique count. Importantly, I do not want to reduce the number of rows in the dataframe; I am effectively after something similar to a window function in SQL.

import pandas as pd

df = pd.DataFrame({
         'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
         'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

df.groupby('mID')['uID'].nunique()

This gets the unique count per group, but it summarises (reduces the rows). I would effectively like to do something along the lines of:

df['ncount'] = df.groupby('mID')['uID'].transform('nunique')

(this obviously does not work)

It is possible to get the desired outcome by taking the summarised dataframe of unique counts and joining it back to the original dataframe, but I am wondering if there is a more minimal solution.
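For reference, the join-based workaround described above might look roughly like the following. This is only a sketch: the names counts and merged are illustrative and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C'],
})

# Summarise: one row per mID holding the number of distinct uID values.
# 'counts' is an illustrative name, not something from the question.
counts = df.groupby('mID')['uID'].nunique().rename('ncount').reset_index()

# Join the summary back onto the original rows; a left merge keeps
# every row of df, so the dataframe is not reduced.
merged = df.merge(counts, on='mID', how='left')
print(merged)
```

It works, but it needs an intermediate dataframe and a merge, which is the overhead the question is hoping to avoid.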

Thanks

GroupBy.transform('nunique')

On pandas v0.23.4, your solution works for me.

df['ncount'] = df.groupby('mID')['uID'].transform('nunique')
df
      uID mID  ncount
0   James   A       5
1   Henry   B       2
2     Abe   A       5
3   James   B       2
4   Henry   A       5
5   Brian   A       5
6  Claude   A       5
7   James   C       1

GroupBy.nunique + pd.Series.map

Additionally, with your existing solution, you could map the resulting series back onto mID:

df['ncount'] = df.mID.map(df.groupby('mID')['uID'].nunique())
df
      uID mID  ncount
0   James   A       5
1   Henry   B       2
2     Abe   A       5
3   James   B       2
4   Henry   A       5
5   Brian   A       5
6  Claude   A       5
7   James   C       1

You are very close!

df['ncount'] = df.groupby('mID')['uID'].transform(pd.Series.nunique)

      uID mID  ncount
0   James   A       5
1   Henry   B       2
2     Abe   A       5
3   James   B       2
4   Henry   A       5
5   Brian   A       5
6  Claude   A       5
7   James   C       1
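As a quick sanity check, the three variants shown in the answers can be compared side by side. This is a minimal sketch on the question's sample data; the variable names (via_string, via_callable, via_map, same) are illustrative, and behaviour on other pandas versions is assumed rather than verified here.

```python
import pandas as pd

df = pd.DataFrame({
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C'],
})

grouped = df.groupby('mID')['uID']

# 1. transform with the string alias
via_string = grouped.transform('nunique')

# 2. transform with the Series method passed as a callable
via_callable = grouped.transform(pd.Series.nunique)

# 3. aggregate per group, then map the per-group counts back onto mID
via_map = df['mID'].map(grouped.nunique())

# All three should give one count per original row.
same = via_string.tolist() == via_callable.tolist() == via_map.tolist()
print(same)
```

All three keep the original row count; the transform variants avoid the separate aggregate-then-map step.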
