
Conditionally calculate time differences between rows in R

I'm trying to calculate the time difference between each row and the most recent row whose criteria column meets some condition.

Reading in some data:

my_data <- data.frame(criteria = c("some text", "some more text", " ", " ", "more text", " "),
                  timestamp = as.POSIXct(c("2015-07-30 15:53:15", "2015-07-30 15:53:47", "2015-07-30 15:54:48", "2015-07-30 15:55:48", "2015-07-30 15:56:48", "2015-07-30 15:57:49")))

        criteria           timestamp
1      some text 2015-07-30 15:53:15
2 some more text 2015-07-30 15:53:47
3                2015-07-30 15:54:48
4                2015-07-30 15:55:48
5      more text 2015-07-30 15:56:48
6                2015-07-30 15:57:49

I want to get the time difference (in minutes) between every row and the last row that wasn't blank in the criteria column. Therefore, I want:

        criteria           timestamp time_diff
1      some text 2015-07-30 15:53:15         0
2 some more text 2015-07-30 15:53:47         0
3                2015-07-30 15:54:48         1
4                2015-07-30 15:55:48         2
5      more text 2015-07-30 15:56:48         0
6                2015-07-30 15:57:49         1

So far, I've built the code to recognize where the 0's should be; I just need the code to fill in the time differences. Here's my code:

my_data$time_diff <- ifelse (my_data$criteria != "", # Here's our statement
  my_data$time_diff <- "0", # Here's what happens if statement is TRUE
  my_data$time_diff <- NEED CODE HERE # if statement FALSE
  )

I have a feeling that this job may be better performed by something that isn't an ifelse statement, but I'm relatively new to R.

I've found questions on here where people tried to get time differences between neighboring rows (e.g. here and here), but I have yet to find someone dealing with this kind of situation.

The closest question I've found to mine is this one, but those data differ from mine in how the poster wants to process them (at least from my vantage point).

edit: capitalized title.

Completing the answer with alexis_laz's masterful expression:

my_data <- data.frame(criteria = c("some text", "some more text", " ", " ", "more text", " "),
                      timestamp = as.POSIXct(c("2015-07-30 15:53:15", "2015-07-30 15:53:47", "2015-07-30 15:54:48", "2015-07-30 15:55:48", "2015-07-30 15:56:48", "2015-07-30 15:57:49")))

my_data$time_diff <- 
  my_data$timestamp - 
  my_data[cummax((my_data$criteria != " ") * seq_len(nrow(my_data))), 'timestamp']

my_data

        criteria           timestamp time_diff
1      some text 2015-07-30 15:53:15    0 secs
2 some more text 2015-07-30 15:53:47    0 secs
3                2015-07-30 15:54:48   61 secs
4                2015-07-30 15:55:48  121 secs
5      more text 2015-07-30 15:56:48    0 secs
6                2015-07-30 15:57:49   61 secs
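
Since the question asked for minutes and the result above comes back in seconds, here is a rough sketch of the same idea split into named steps, with the unit set explicitly through difftime(). The intermediate names (has_text, row_or_zero, last_text_row), the tz = "UTC" argument and the trimws() call are my own additions for illustration, not part of the original answer.

```r
# Same sample data as above; tz = "UTC" is an added assumption to keep the example reproducible.
my_data <- data.frame(
  criteria = c("some text", "some more text", " ", " ", "more text", " "),
  timestamp = as.POSIXct(c("2015-07-30 15:53:15", "2015-07-30 15:53:47",
                           "2015-07-30 15:54:48", "2015-07-30 15:55:48",
                           "2015-07-30 15:56:48", "2015-07-30 15:57:49"),
                         tz = "UTC")
)

# TRUE where criteria holds text; trimws() treats "" and " " alike.
has_text <- trimws(my_data$criteria) != ""

# Row number where criteria has text, 0 elsewhere: 1 2 0 0 5 0
row_or_zero <- has_text * seq_len(nrow(my_data))

# Running maximum = index of the most recent row with text: 1 2 2 2 5 5
last_text_row <- cummax(row_or_zero)

# difftime() with explicit units returns minutes instead of an automatically chosen unit.
my_data$time_diff <- as.numeric(
  difftime(my_data$timestamp, my_data$timestamp[last_text_row], units = "mins")
)

# Rounded to whole minutes: 0 0 1 2 0 1
round(my_data$time_diff)
```

One caveat: if the very first row has a blank criteria, the running maximum starts at 0, and indexing with 0 silently drops that position, so the lengths no longer line up. That case would need separate handling.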
