Pandas groupby 2 columns then count and mean
I have a data frame of users, with one row for each time a user entered a website. It looks like this (if there are x rows with the same week and date, the user entered the site x times on that date):
ID | week | date
---|---|---
1 | 2 | 20/07/21
2 | 3 | 23/07/21
2 | 3 | 23/07/21
2 | 3 | 26/07/21
2 | 4 | 30/07/21
2 | 4 | 30/07/21
2 | 4 | 30/07/21
2 | 4 | 31/07/21
So far I've managed to get this:
ID | week | date | days number
---|---|---|---
1 | 2 | 20/07/21 | 1
2 | 3 | 23/07/21 | 2
2 | 3 | 26/07/21 | 1
2 | 4 | 30/07/21 | 3
2 | 4 | 31/07/21 | 1
using this code:
df.groupby(['ID','week','date']).agg({'date':['count']})
But I need the mean number of times each user used the site per day, calculated by week, so that each user has one row per week. Therefore, the output I need looks like this:
ID | week | mean days number
---|---|---
1 | 2 | 1
2 | 3 | 1.5
2 | 4 | 2
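To make the target concrete, the arithmetic for user 2 can be checked by hand. This is only a minimal sketch in plain Python: the counts are copied from the intermediate table above, and the names `visits_per_date` and `weekly_mean` are illustrative, not part of the original code.

```python
from statistics import mean

# Visits per date for user 2, copied from the intermediate table
# (keys of the outer dict are week numbers).
visits_per_date = {
    3: {"23/07/21": 2, "26/07/21": 1},
    4: {"30/07/21": 3, "31/07/21": 1},
}

# Mean of the daily counts within each week.
weekly_mean = {
    week: mean(counts.values())
    for week, counts in visits_per_date.items()
}
print(weekly_mean)
```

Week 3 averages the counts 2 and 1, and week 4 averages 3 and 1, which matches the 1.5 and 2 in the desired output.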
Any ideas how to continue?
Thanks!!
Use:
(df.groupby(['ID', 'week', 'date'], as_index=False)['date']
   .agg('count')
   .groupby(['ID', 'week'], as_index=False)
   .agg(**{'mean days number': ('date', 'mean')})
)
Output:
   ID  week  mean days number
0   1     2               1.0
1   2     3               1.5
2   2     4               2.0
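The same two steps can also be written more explicitly. The following is a sketch rather than the method given above: it counts rows with `size()` instead of calling `agg('count')` on a grouping column, which may behave differently across pandas versions. The rebuilt sample frame and the names `per_day` and `result` are assumptions made for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed to match the real frame).
df = pd.DataFrame({
    "ID":   [1, 2, 2, 2, 2, 2, 2, 2],
    "week": [2, 3, 3, 3, 4, 4, 4, 4],
    "date": ["20/07/21", "23/07/21", "23/07/21", "26/07/21",
             "30/07/21", "30/07/21", "30/07/21", "31/07/21"],
})

# Step 1: number of visits per user, week and date.
per_day = (
    df.groupby(["ID", "week", "date"])
      .size()
      .reset_index(name="days number")
)

# Step 2: mean of those daily counts per user and week.
result = (
    per_day.groupby(["ID", "week"], as_index=False)["days number"]
           .mean()
           .rename(columns={"days number": "mean days number"})
)
print(result)
```

Keeping `per_day` as a separate frame makes it easy to inspect the intermediate counts before averaging. Note that the mean is taken only over dates on which the user actually visited; days with no visits do not appear in the data and so do not pull the average down.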