Rolling count of unique categorical features by groups
I would like to get a rolling count of unique categoricals by group:
Group Item
A pen
A pen
A elbow
A warthog
B elbow
B peach
Should result in:
Group Item Unique_item_count
A pen 1
A pen 1
A elbow 2
A warthog 3
B elbow 1
B peach 2
I feel like pd.rolling_count might have the answer, but I haven't figured it out. Thanks for your wisdom and wizardry!
We can use GroupBy twice. The first pass calls nunique on each (Group, Item) pair, and the second applies cumsum within each Group so that the count goes up for each unique value in Item:
Then we merge these results back into our original dataframe.
s = df.groupby(['Group', 'Item'], sort=False)['Item'].nunique().groupby(level=0).cumsum()
final = df.merge(s.reset_index(name='Unique_item_count'), on=['Group', 'Item'])
Output
Group Item Unique_item_count
0 A pen 1
1 A pen 1
2 A elbow 2
3 A warthog 3
4 B elbow 1
5 B peach 2
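To make the intermediate results visible, the one-liner above can be split into steps. This is a minimal sketch that rebuilds the sample frame from the question. The names per_pair, running, and lookup are illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "B", "B"],
    "Item": ["pen", "pen", "elbow", "warthog", "elbow", "peach"],
})

# Step 1: one row per (Group, Item) pair, kept in order of first appearance
# because sort=False. Within such a pair, nunique() of Item is always 1.
per_pair = df.groupby(["Group", "Item"], sort=False)["Item"].nunique()

# Step 2: running total of those 1s inside each Group (index level 0).
running = per_pair.groupby(level=0).cumsum()

# Step 3: turn the Series into a lookup table and merge it back onto the rows.
lookup = running.reset_index(name="Unique_item_count")
final = df.merge(lookup, on=["Group", "Item"])

print(lookup)
print(final)
```

Each (Group, Item) pair receives a single value in the lookup table, so an item that reappears later within a group keeps the number assigned at its first appearance.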
The approach is the same as the one suggested by Erfan, except that no merge is needed:
df.groupby(['Group', 'Item'], sort=False)['Item'].nunique().groupby(level=0).cumsum().reindex(df).reset_index(name='Unique_count')
Output
Group Item Unique_count
0 A pen 1
1 A pen 1
2 A elbow 2
3 A warthog 3
4 B elbow 1
5 B peach 2
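The reindex call above passes the DataFrame itself as the target. A variant that builds the row-level MultiIndex explicitly may be easier to read and does not depend on how reindex converts a DataFrame. This is a sketch under that assumption, not the answerer's code. The names running and row_keys are illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "B", "B"],
    "Item": ["pen", "pen", "elbow", "warthog", "elbow", "peach"],
})

# Cumulative count per (Group, Item) pair, as in the answers above.
running = (
    df.groupby(["Group", "Item"], sort=False)["Item"]
      .nunique()
      .groupby(level=0)
      .cumsum()
)

# One (Group, Item) key per original row, duplicates included.
row_keys = pd.MultiIndex.from_frame(df[["Group", "Item"]])

# Look up each row's key in the per-pair Series, then flatten to a DataFrame.
result = running.reindex(row_keys).reset_index(name="Unique_count")

print(result)
```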