Pandas groupby two columns and get unique count

I have the following dataframe:

   ID       hour                          
  3403       9
  3478       1
  3478       1
  3478       1
  3478       1
  3478       1
  3478       1
  3481       1
  3489       1
  3489       1
  3489       1
  3489       1
  3489       1
  3489       1
  3489       1
  3502       2
  3502       2
  3502       2

I want to get the count of unique IDs for each hour. Meaning, I want something like this:

count     hour
  1        9
  3        1
  1        2 

How can I do this?
All I have done so far is group by both hour and ID, like this:

df.groupby(['hour', 'ID'])

But I don't know how to proceed further.

import pandas as pd

# input data
d = {'ID': [3403, 3478, 3478, 3481, 3502, 3502], 'Hour': [9, 1, 1, 1, 2, 2]}
df = pd.DataFrame(data=d)
# drop duplicate (ID, Hour) rows so each ID is counted once per hour
df = df.drop_duplicates(keep='first')
# group by Hour and count the remaining IDs
df = df.groupby('Hour')['ID'].count().reset_index(name='count')

You can simply group by both columns and then count the groups per hour:

df.groupby(['Hour', 'ID']).size().reset_index().groupby('Hour').size()

This might work:

df.groupby(['hour']).agg(count=('ID', 'nunique')).reset_index()
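
For reference, here is a minimal self-contained sketch of the nunique approach applied to the data from the question. It assumes the columns are named 'ID' and 'hour' as shown there; the sort=False argument is an addition, used only so the hours come out in the order they first appear (9, 1, 2) rather than sorted.

```python
import pandas as pd

# Data copied from the question (column names 'ID' and 'hour' are assumed from it)
df = pd.DataFrame({
    'ID':   [3403] + [3478] * 6 + [3481] + [3489] * 7 + [3502] * 3,
    'hour': [9] + [1] * 6 + [1] + [1] * 7 + [2] * 3,
})

# Count distinct IDs within each hour.
# sort=False keeps the hours in order of first appearance instead of sorting them.
result = df.groupby('hour', sort=False)['ID'].nunique().reset_index(name='count')
print(result)
```

With this data, hour 1 contains three distinct IDs (3478, 3481, 3489), while hours 9 and 2 contain one each, which matches the output requested in the question.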

 