Filter, group by and count in pandas?

A TSV file contains some user event data:

user_uid category event_type
"11"      "like"   "post"
"33"      "share"  "status"
"11"      "like"   "post"
"42"      "share"  "post"

What is the best way to get the number of post events for each category and for each user_uid?

We should show the following output:

user_uid category count
"11"     "like"    2
"42"     "share"   1

Clean up any trailing whitespace so that things group properly. Filter your DataFrame, and then apply groupby + size:

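# Strip stray whitespace so equal keys fall into the same group
df['category'] = df.category.str.strip()
df['user_uid'] = df.user_uid.str.strip()
# Keep only 'post' events, then count rows per (user_uid, category)
df[df.event_type == 'post'].groupby(['user_uid', 'category']).size()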

Output:

user_uid  category
11        like        2
42        share       1
dtype: int64
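
The result above is a Series with a two-level index. To get a flat table with a count column, as in the expected output from the question, one option is to call reset_index(name='count') on it. Here is a minimal, self-contained sketch, assuming the file is tab-separated and read with dtype=str so that user_uid stays text. The inline sample data, including the trailing spaces, is made up for illustration:

```python
import io

import pandas as pd

# Sample data mirroring the question, tab-separated. The trailing spaces are
# added on purpose to show why the strip step matters.
raw = (
    "user_uid\tcategory\tevent_type\n"
    "11\tlike \tpost\n"
    "33\tshare\tstatus\n"
    "11\tlike\tpost\n"
    "42 \tshare\tpost\n"
)

# dtype=str keeps user_uid as text, so the .str accessor works on every column.
df = pd.read_csv(io.StringIO(raw), sep="\t", dtype=str)

# Strip stray whitespace from the columns used for filtering and grouping.
for col in ["user_uid", "category", "event_type"]:
    df[col] = df[col].str.strip()

# Filter to 'post' events, count rows per group, then flatten the index
# into ordinary columns with the counts in a column named 'count'.
counts = (
    df[df["event_type"] == "post"]
    .groupby(["user_uid", "category"])
    .size()
    .reset_index(name="count")
)
print(counts)
```

With a real file, the io.StringIO(raw) argument would be replaced by the path to the TSV.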
