
Rolling count of unique categorical features by groups

I would like to get a rolling count of unique categorical values within each group:

Group  Item
A      pen
A      pen
A      elbow
A      warthog
B      elbow
B      peach

This should result in:

Group  Item     Unique_item_count
A      pen      1
A      pen      1
A      elbow    2
A      warthog  3
B      elbow    1
B      peach    2
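To pin down what is being asked, here is a minimal plain-Python sketch of the expected column: for each row, count how many distinct items have appeared so far within that row's group. The function name `rolling_unique_count` and the `rows` list are illustrative names, not part of the original question.

```python
from collections import defaultdict

# Sample data from the question, as (Group, Item) pairs in row order.
rows = [
    ("A", "pen"),
    ("A", "pen"),
    ("A", "elbow"),
    ("A", "warthog"),
    ("B", "elbow"),
    ("B", "peach"),
]

def rolling_unique_count(rows):
    """Return, per row, the number of distinct items seen so far in its group."""
    seen = defaultdict(set)  # group -> set of items seen so far
    counts = []
    for group, item in rows:
        seen[group].add(item)
        counts.append(len(seen[group]))
    return counts

counts = rolling_unique_count(rows)
print(counts)  # [1, 1, 2, 3, 1, 2]
```

This is only a reference definition for checking results; the answers below do the same thing with pandas operations.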

I feel like pd.rolling_count might have the answer, but I haven't figured it out. Thanks for your wisdom and wizardry!

We can use GroupBy twice. The first groupby, on Group and Item, returns nunique for each (Group, Item) pair, which is 1 for every pair. The second groupby, on Group alone, takes the cumsum of those values, so the count goes up by one for each new unique value in Item.

Then we merge the result back into our original dataframe:

s = df.groupby(['Group', 'Item'], sort=False)['Item'].nunique().groupby(level=0).cumsum()

final = df.merge(s.reset_index(name='Unique_item_count'), on=['Group', 'Item'])

Output

  Group     Item  Unique_item_count
0     A      pen                  1
1     A      pen                  1
2     A    elbow                  2
3     A  warthog                  3
4     B    elbow                  1
5     B    peach                  2
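For anyone who wants to run the answer above end to end, here is a self-contained sketch that builds the sample dataframe and shows the intermediate Series. It assumes a reasonably recent pandas; the explicit how="left" in the merge is an addition here (not in the original answer), chosen to make it obvious that the original row order is kept.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "B", "B"],
    "Item": ["pen", "pen", "elbow", "warthog", "elbow", "peach"],
})

# Step 1: one row per (Group, Item) pair, in order of first appearance.
# nunique() is 1 for every pair, since each pair holds a single Item value.
per_pair = df.groupby(["Group", "Item"], sort=False)["Item"].nunique()

# Step 2: running total of those 1s within each Group.
s = per_pair.groupby(level=0).cumsum()

# Step 3: attach the per-pair count to every original row.
# how="left" is an assumption added here to keep df's row order explicit.
final = df.merge(
    s.reset_index(name="Unique_item_count"),
    on=["Group", "Item"],
    how="left",
)
print(final)
```

Note that sort=False matters: with the default sort=True the pairs would be ordered alphabetically, and the running count would no longer follow the order in which items first appear.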

The approach is the same as the one suggested by Erfan, but there is no need to merge:

df.groupby(['Group', 'Item'], sort=False)['Item'].nunique().groupby(level=0).cumsum().reindex(df).reset_index(name='Unique_count')

Output

  Group     Item  Unique_count
0     A      pen             1
1     A      pen             1
2     A    elbow             2
3     A  warthog             3
4     B    elbow             1
5     B    peach             2
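Passing a whole dataframe to reindex may not be accepted by every pandas version. As a hedged variant of the same idea, the sketch below builds the target index explicitly with pd.MultiIndex.from_frame (available since pandas 0.24). The variable names `counts`, `row_keys`, and `result` are illustrative and not from the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "B", "B"],
    "Item": ["pen", "pen", "elbow", "warthog", "elbow", "peach"],
})

# Per-pair running count, indexed by a unique (Group, Item) MultiIndex.
counts = (
    df.groupby(["Group", "Item"], sort=False)["Item"]
      .nunique()
      .groupby(level=0)
      .cumsum()
)

# One (Group, Item) key per original row; duplicates are fine in the target.
row_keys = pd.MultiIndex.from_frame(df[["Group", "Item"]])

# Look up each row's key in counts, then turn the index back into columns.
result = counts.reindex(row_keys).reset_index(name="Unique_count")
print(result)
```

The reindex works as a lookup here because the (Group, Item) index of `counts` is unique, so repeated keys in `row_keys` simply repeat the matching value.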
