I have a pandas DataFrame like this:
User Dept
1 Cook
1 Cook
1 Home
2 Sports
2 Travel
2 Cook
I want to count the unique users within each department:
Dept User
Cook 2
Home 1
Sports 1
Travel 1
Notice that the department Cook has a count of only two: although 'Cook' appears in three rows, only two unique users are in it.
I have tried the following:
df.groupby(['Dept']).count()  # counts 'Cook' three times
df.drop_duplicates(['Dept']).groupby('Dept')['User'].sum()  # overcounts all departments
I know the answer is a groupby, I just can't seem to figure it out!
You could use nunique:
>>> df.groupby("Dept")["User"].nunique()
Dept
Cook 2
Home 1
Sports 1
Travel 1
Name: User, dtype: int64
>>> df.groupby("Dept")["User"].nunique().reset_index()
Dept User
0 Cook 2
1 Home 1
2 Sports 1
3 Travel 1
(Note that I used your example data, which only has one unique user in Sports.)
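For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the example table from the question by hand (the variable names `unique_users`, `row_counts`, and `deduped` are just illustrative) and compares `nunique` with a plain row count.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "User": [1, 1, 1, 2, 2, 2],
    "Dept": ["Cook", "Cook", "Home", "Sports", "Travel", "Cook"],
})

# Number of distinct users per department.
unique_users = df.groupby("Dept")["User"].nunique()

# Plain row count per department, for comparison: 'Cook' is counted three times.
row_counts = df.groupby("Dept")["User"].count()

# An equivalent alternative: drop duplicate (User, Dept) pairs first, then count rows.
deduped = df.drop_duplicates(["User", "Dept"]).groupby("Dept")["User"].count()

print(unique_users)
print(row_counts)
print(deduped)
```

Note that the deduplication has to be on the (User, Dept) pair. Dropping duplicates on 'Dept' alone keeps a single row per department, so it cannot give a per-department user count.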