How to calculate a time period until a condition is matched

I need to calculate the total time spanned by consecutive dates, up to the point where the difference between two consecutive dates is greater than 13 seconds.

For example, in the data frame created with the code shown below, the column test holds the time difference between consecutive dates. What I need are the time events between the rows where test > 13 seconds.

# Create a vector of dates with a random time difference in seconds between records
dates <- seq(as.POSIXct("2020-01-01 00:00:02"), as.POSIXct("2020-01-02 00:00:02"), by = "2 sec")
dates <- dates + sample(15, length(dates), replace = TRUE)

# Create a data.frame
data <- data.frame(id = seq_along(dates), dates = dates)

# Create a test field with the time difference between each date and the next
data$test <- c(diff(data$dates, lag = 1), 0)

# Delete the zero and negative time
data <- data[data$test > 0, ]

head(data)

What I want is something like this:

[image not available]

To get your desired result, we need to define 'blocks' of observations. Each block is split where test is greater than 13.
We start by identifying the split_point, and then, using the rle function, we assign an ID to each block. Then we filter out the split points and summarize the remaining blocks: once with the sum of seconds, and once with the min of the event dates.

# TRUE for rows inside a block (gap <= 13 s); FALSE marks a split point
split_point <- data$test <= 13
# Find continuous blocks
block_str <- rle(split_point)
# Create block IDs
data$block <- rep(seq_along(block_str$lengths), block_str$lengths)
data <- data[split_point, ] # Remove split points

# Summarize
final_df <- aggregate(test ~ block, data = data, FUN = sum)
dtevent <- aggregate(dates ~ block, data = data, FUN = min)

# Join the two summaries
final_df$DatetimeEvent <- dtevent$dates

head(final_df)
#>   block test       DatetimeEvent
#> 1     1 101  2020-01-01 00:00:09
#> 2     3 105  2020-01-01 00:01:11
#> 3     5 277  2020-01-01 00:02:26
#> 4     7  46  2020-01-01 00:04:58
#> 5     9  27  2020-01-01 00:05:30
#> 6    11 194  2020-01-01 00:05:44

Created on 2020-04-02 by the reprex package (v0.3.0)
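
To see how the rle-based block IDs behave, here is a minimal sketch on a toy vector. The gap values are made up for illustration and are not taken from the question's data. The name within_block is hypothetical. The cumsum variant at the end is an alternative idiom, not part of the answer above.

```r
# Toy gaps in seconds (hypothetical values, not the question's data)
test <- c(5, 8, 14, 3, 2, 15, 15, 7)

# TRUE for rows whose gap stays within the 13-second threshold
within_block <- test <= 13

# rle() collapses consecutive equal values into runs;
# repeating each run's index by its length gives one block ID per row
runs <- rle(within_block)
block <- rep(seq_along(runs$lengths), runs$lengths)
print(data.frame(test, within_block, block))

# Keep only rows inside a block and total the seconds per block
totals <- aggregate(test ~ block,
                    data = data.frame(test, block)[within_block, ],
                    FUN = sum)
print(totals)

# Alternative idiom (not from the answer): start a new block at every split row
block_alt <- cumsum(test > 13)
totals_alt <- aggregate(test ~ block_alt,
                        data = data.frame(test, block_alt)[within_block, ],
                        FUN = sum)
print(totals_alt)
```

Because kept and dropped runs alternate, the kept blocks receive every other ID, which is why the output above lists blocks 1, 3, 5 and so on. The cumsum variant yields different block labels but the same per-block totals.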

Using dplyr for convenience:

library(dplyr)

final_df <- data %>%
  mutate(split_point = test <= 13,
         block = with(rle(split_point), rep(seq_along(lengths), lengths))) %>%
  group_by(block) %>%
  filter(split_point) %>%
  summarise(DateTimeEvent = min(dates), TotalTime = sum(test))

final_df
#> # A tibble: 1,110 x 3
#>    block DateTimeEvent       TotalTime
#>    <int> <dttm>              <drtn>   
#>  1     1 2020-01-01 00:00:06 260 secs 
#>  2     3 2020-01-01 00:02:28 170 secs 
#>  3     5 2020-01-01 00:04:11 528 secs 
#>  4     7 2020-01-01 00:09:07  89 secs 
#>  5     9 2020-01-01 00:10:07  37 secs 
#>  6    11 2020-01-01 00:10:39 135 secs 
#>  7    13 2020-01-01 00:11:56  50 secs 
#>  8    15 2020-01-01 00:12:32 124 secs 
#>  9    17 2020-01-01 00:13:52  98 secs 
#> 10    19 2020-01-01 00:14:47  83 secs 
#> # … with 1,100 more rows

Created on 2020-04-02 by the reprex package (v0.3.0)

(Results are different because reprex recreates the data each time.)
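
If you want both approaches to run on the same data, one option is to fix the random seed before sampling. This is a minimal sketch using base R only. make_data is a hypothetical helper and the seed value 42 is arbitrary. Unlike the original code, it pins the time zone to UTC and stores the gaps as plain numeric seconds, both assumptions made here for reproducibility.

```r
# Build the question's data frame from a fixed seed (hypothetical helper)
make_data <- function(seed = 42) {
  set.seed(seed)  # makes sample() return the same offsets on every call
  dates <- seq(as.POSIXct("2020-01-01 00:00:02", tz = "UTC"),
               as.POSIXct("2020-01-02 00:00:02", tz = "UTC"),
               by = "2 sec")
  dates <- dates + sample(15, length(dates), replace = TRUE)
  data <- data.frame(id = seq_along(dates), dates = dates)
  # Gap to the next record, converted to numeric seconds
  data$test <- c(as.numeric(diff(data$dates), units = "secs"), 0)
  # Drop zero and negative gaps, as in the question
  data[data$test > 0, ]
}

a <- make_data()
b <- make_data()
print(identical(a, b))  # TRUE: same seed, same data
```

With the data fixed this way, the base R and dplyr summaries can be compared row by row.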
