Pandas groupby two columns and get unique count
I have the following dataframe:
ID hour
3403 9
3478 1
3478 1
3478 1
3478 1
3478 1
3478 1
3481 1
3489 1
3489 1
3489 1
3489 1
3489 1
3489 1
3489 1
3502 2
3502 2
3502 2
I want to get the count of unique IDs for each hour. Meaning, I want something like this:
count hour
1 9
3 1
1 2
How can I do this?
All I have done so far is group by both hour and ID, like this:
df.groupby(['hour', 'CONVERSATIONID'])
But I don't know how to proceed further.
import pandas as pd

# input data
d = {'ID': [3403, 3478, 3478, 3481, 3502, 3502], 'Hour': [9, 1, 1, 1, 2, 2]}
df = pd.DataFrame(data=d)
# drop duplicate (ID, Hour) rows so each ID is counted once per hour
df = df.drop_duplicates(subset=['ID', 'Hour'], keep='first')
# group by Hour and count the remaining IDs
df = df.groupby('Hour')['ID'].count().reset_index(name='count')
You can simply use groupby and then count:
df.groupby(['Hour', 'ID']).size().reset_index().groupby('Hour').size()
This might work:
df.groupby(['hour']).agg(count=('ID', 'nunique')).reset_index()
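For reference, here is a minimal, self-contained sketch of the nunique approach run against the data from the question. It assumes the columns are named ID and hour as in the table shown (the question's own snippet uses CONVERSATIONID, so adjust the name to match your frame); the variable name result is only illustrative.

```python
import pandas as pd

# Data copied from the question's table (column names assumed: "ID", "hour").
df = pd.DataFrame({
    "ID": [3403] + [3478] * 6 + [3481] + [3489] * 7 + [3502] * 3,
    "hour": [9] + [1] * 6 + [1] + [1] * 7 + [2] * 3,
})

# Count distinct IDs per hour.
# sort=False keeps the hours in order of first appearance (9, 1, 2).
result = (
    df.groupby("hour", sort=False)["ID"]
      .nunique()
      .reset_index(name="count")
)

# Reorder the columns to match the layout asked for in the question.
result = result[["count", "hour"]]
print(result)
```

With this data the counts come out as 1, 3 and 1 for hours 9, 1 and 2, which matches the desired output. Grouping by hour alone is enough here: nunique already collapses repeated IDs within each hour, so there is no need to group by both columns first.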