R: count 15 minutes interval in time
I would like to count the number of sessions started in each 15-minute interval on business days within a large dataset.
My data looks like:
df <-
Start_datetime End_datetime Duration Volume
2016-04-01 06:20:55 2016-04-01 14:41:22 08:20:27 8.360
2016-04-01 08:22:27 2016-04-01 08:22:40 00:00:13 0.000
2016-04-01 08:38:53 2016-04-01 09:31:58 00:53:05 12.570
2016-04-01 09:33:57 2016-04-01 12:37:43 03:03:46 7.320
2016-04-01 10:05:03 2016-04-01 16:41:16 06:36:13 9.520
2016-04-01 12:07:57 2016-04-02 22:22:32 34:14:35 7.230
2016-04-01 16:56:55 2016-04-02 10:40:17 17:43:22 5.300
2016-04-01 17:29:18 2016-04-01 19:50:29 02:21:11 7.020
2016-04-01 17:42:39 2016-04-01 19:45:38 02:02:59 2.430
2016-04-01 17:47:57 2016-04-01 20:26:35 02:38:38 8.090
2016-04-01 22:00:15 2016-04-04 08:22:21 58:22:06 4.710
2016-04-02 01:12:38 2016-04-02 09:49:00 08:36:22 3.150
2016-04-02 01:32:00 2016-04-02 12:49:47 11:17:47 5.760
2016-04-02 07:28:48 2016-04-04 06:58:56 47:30:08 0.000
2016-04-02 07:55:18 2016-04-05 07:55:15 71:59:57 0.240
I would like to count all the sessions starting in each 15-minute interval, like this:
For business days
Time                 PTU  Count
00:00:00 - 00:15:00  1    10    #(where count is the number of sessions started between 00:00:00 and 00:15:00)
00:15:00 - 00:30:00  2    6
00:30:00 - 00:45:00  3    5
00:45:00 - 01:00:00  4    3
And so on, with the same table for the weekend.
I have tried the cut function:
df$PTU <- table (cut(df$Start_datetime, breaks="15 minutes"))
data.frame(PTU)
EDIT: When I run this I receive the following error:
Error in cut.default(df$Start_datetime, breaks = "15 minutes") :'x' must be numeric
I have also tried some lubridate functions, but I can't seem to make them work. My final goal is to create a table like the following, but with 15-minute intervals.
There are two things you have to keep in mind when using cut on datetimes:
Make sure your data is actually of class POSIXt. I'm quite sure yours isn't, or R wouldn't be using cut.default but cut.POSIXt as a method.
"15 minutes" should be "15 min". See ?cut.POSIXt
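The first point can be checked directly. This is a minimal sketch, assuming the datetimes were read in as plain text (the vector `x` is a hypothetical stand-in for `df$Start_datetime`, and `tz = "UTC"` is an assumption, not something stated in the question):

```r
# Hypothetical input: datetimes stored as text, as often happens after read.csv()
x <- c("2016-04-01 06:20:55", "2016-04-01 08:22:27")

class(x)                 # "character", so cut() dispatches to cut.default
inherits(x, "POSIXt")    # FALSE

# Parse explicitly; tz = "UTC" is an assumption, use the zone your data was recorded in
x <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

inherits(x, "POSIXt")    # TRUE, so cut() now dispatches to cut.POSIXt
```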
So this works:
Start_datetime <- as.POSIXct(
c("2016-04-01 06:20:55",
"2016-04-01 06:22:12",
"2016-04-01 05:30:12")
)
table(cut(Start_datetime, breaks = "15 min"))
# 2016-04-01 05:30:00 2016-04-01 05:45:00 2016-04-01 06:00:00 2016-04-01 06:15:00
#                   1                   0                   0                   2
Note that the output gives you the start of each 15-minute interval as the names of the table.
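Because cut labels each interval with its full date and time, it does not by itself give the per-day slot number (PTU) or the business day / weekend split asked for in the question. One possible way to get there in base R is sketched below. The sample values come from the question, while `tz = "UTC"` and the names `ptu`, `day_type` and `counts` are assumptions made for illustration:

```r
# Sample start times taken from the question
Start_datetime <- as.POSIXct(
  c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53",
    "2016-04-02 01:12:38", "2016-04-02 01:32:00"),
  tz = "UTC"  # assumption: use the time zone your data was recorded in
)

lt <- as.POSIXlt(Start_datetime)

# Slot number within the day: 1 for 00:00-00:15, ..., 96 for 23:45-24:00
ptu <- lt$hour * 4 + lt$min %/% 15 + 1

# POSIXlt wday: 0 = Sunday, 6 = Saturday (independent of locale)
day_type <- ifelse(lt$wday %in% c(0, 6), "weekend", "business day")

# factor() with all 96 levels keeps empty slots in the table as zeros
counts <- table(PTU = factor(ptu, levels = 1:96), day_type)

# Show only the slots that contain at least one session
counts[rowSums(counts) > 0, ]
```

Public holidays are not handled here; they would need a separate list of dates.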
Here is a more or less complete process from datetime "strings" to the format you want. The starting point is a string vector:
Start_time <-
c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53",
"2016-04-01 09:33:57", "2016-04-01 10:05:03", "2016-04-01 12:07:57",
"2016-04-01 16:56:55", "2016-04-01 17:29:18", "2016-04-01 17:42:39",
"2016-04-01 17:47:57", "2016-04-01 22:00:15", "2016-04-02 01:12:38",
"2016-04-02 01:32:00", "2016-04-02 07:28:48", "2016-04-02 07:55:18"
)
df <- data.frame(Start_time)
And this is the actual processing:
## We will use two packages
library(lubridate)
library(data.table)
# convert df to data.table, parse the datetime string
setDT(df)[, Start_time := ymd_hms(Start_time)]
# floor time by 15 min to assign the appropriate slot (new variable Start_time_slot)
df[, Start_time_slot := floor_date(Start_time, "15 min")]
# aggregate by wday and time in a date
start_time_data_frame <- df[, .N, by = .(wday = wday(Start_time_slot), time = format(Start_time_slot, format = "%H:%M:%S"))]
# output looks like this
start_time_data_frame
##     wday     time N
##  1:    6 06:15:00 1
##  2:    6 08:15:00 1
##  3:    6 08:30:00 1
##  4:    6 09:30:00 1
##  5:    6 10:00:00 1
##  6:    6 12:00:00 1
##  7:    6 16:45:00 1
##  8:    6 17:15:00 1
##  9:    6 17:30:00 1
## 10:    6 17:45:00 1
## 11:    6 22:00:00 1
## 12:    7 01:00:00 1
## 13:    7 01:30:00 1
## 14:    7 07:15:00 1
## 15:    7 07:45:00 1
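The result above is grouped by weekday number, while the question asks for business days versus weekend and a slot number (PTU). A minimal sketch of that last step, using data.table only, could look like this. The column names `day_type` and `PTU`, the object names `dt` and `counts`, and `tz = "UTC"` are assumptions made for illustration:

```r
library(data.table)

# Sample start times taken from the question; tz = "UTC" is an assumption
dt <- data.table(Start_time = as.POSIXct(
  c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53",
    "2016-04-02 01:12:38", "2016-04-02 01:32:00"),
  tz = "UTC"
))

# data.table::wday(): 1 = Sunday, 7 = Saturday
dt[, day_type := ifelse(wday(Start_time) %in% c(1, 7), "weekend", "business day")]

# Slot number within the day: 1 for 00:00-00:15, ..., 96 for 23:45-24:00
dt[, PTU := hour(Start_time) * 4L + minute(Start_time) %/% 15L + 1L]

# Count sessions per day type and slot, sorted by both
counts <- dt[, .N, keyby = .(day_type, PTU)]
counts
```

Slots without any session do not appear in `counts`; if zero rows are needed, the result would have to be joined against a full grid of 96 slots per day type.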