How do I group the data into 5min user bins and subsequently count the records?
So I have a data frame containing timestamps:
new_date id
------------------- ----
2021-03-22 00:12:29 164616
2021-03-22 00:11:51 297284
2021-03-22 00:11:19 148817
2021-03-22 00:11:19 139208
2021-03-22 00:10:29 301459
2021-03-22 00:09:48 299543
2021-03-22 00:09:12 302444
I want to split the data into 5-minute bins and count the ids of active users that fall within each bin:
new_date id
------------------- ----
2021-03-22 00:20:00 0
2021-03-22 00:15:00 13
2021-03-22 00:10:00 5
2021-03-22 00:05:00 2
So far I have tried:
date["new_dates"] = pd.to_datetime(date['\tgp:last_session_date'], errors='coerce')
date = date.drop('\tgp:last_session_date', 1)
date.dropna()
df.groupby(pd.Grouper(key ="new_dates", freq = '5Min')).agg({"\tuser_id": "count"})
But it gives a weird output with different dates:
2021-02-24 18:45:00 1
2021-02-24 18:50:00 0
2021-02-24 18:55:00 0
2021-02-24 19:00:00 0
2021-02-24 19:05:00 0
I think the output is expected if there is some 'lost' datetime near 2021-02-24 18:45:00 in the data.
You can sort the original data to see it:
df = df.sort_values('new_date')
That row is then counted as 1 and the following values are 0, because those datetimes do not exist in the data (and the output is a consecutive DatetimeIndex).
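To illustrate the point above, here is a minimal sketch with made-up data (the frame name `sample`, the column `user_id` without the tab prefix, and the stray timestamp are all assumptions, not the asker's real data). One early record is enough to make the grouped index start there, and every 5-minute bin between it and the real data appears with a count of 0:

```python
import pandas as pd

# Hypothetical sample: three "real" records plus one stray earlier timestamp.
sample = pd.DataFrame({
    "new_dates": pd.to_datetime([
        "2021-03-22 00:12:29",
        "2021-03-22 00:11:51",
        "2021-03-22 00:09:48",
        "2021-03-21 23:47:00",  # stray record, plays the role of the 'lost' datetime
    ]),
    "user_id": [164616, 297284, 299543, 1],
})

# Group into consecutive 5-minute bins and count the ids in each bin.
counts = (
    sample.groupby(pd.Grouper(key="new_dates", freq="5min"))["user_id"]
    .count()
)
print(counts)

# If the empty bins are not wanted, they can be filtered out afterwards.
non_empty = counts[counts > 0]
print(non_empty)
```

If the stray record is itself bad data, filtering it out before grouping is probably the cleaner fix; dropping the zero rows afterwards only hides the gap.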
EDIT:
If you need to remove NaNs, you must assign the output of DataFrame.dropna back, otherwise the call has no effect (or use the in-place alternative):
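import pandas as pd

# parse the raw column; unparseable values become NaT
date["new_dates"] = pd.to_datetime(date['\tgp:last_session_date'], errors='coerce')
date = date.drop('\tgp:last_session_date', axis=1)
# dropna returns a new DataFrame, so assign it back
date = date.dropna()
#alternative
#date.dropna(inplace=True)
# sort to inspect the earliest timestamps
df = date.sort_values('new_dates')
print (df)
df.groupby(pd.Grouper(key ="new_dates", freq = '5min')).agg({"\tuser_id": "count"})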
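The dropna behaviour described in the EDIT can be checked on a tiny made-up frame. This is only a sketch: the names `raw`, `cleaned`, `last_session_date` and the sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical raw data with one value that cannot be parsed as a date.
raw = pd.DataFrame({
    "last_session_date": ["2021-03-22 00:12:29", "not a date", "2021-03-22 00:09:48"],
    "user_id": [164616, 297284, 299543],
})
raw["new_dates"] = pd.to_datetime(raw["last_session_date"], errors="coerce")

# Calling dropna without using its return value leaves `raw` unchanged.
raw.dropna()
rows_before = len(raw)

# Assigning the result (or passing inplace=True) actually removes the NaT row.
cleaned = raw.dropna(subset=["new_dates"])
rows_after = len(cleaned)

print(rows_before, rows_after)
```

Passing `subset=["new_dates"]` limits the check to the timestamp column, so rows with missing values in other columns are kept; plain `dropna()` as in the answer drops a row if any column is missing.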