简体   繁体   中英

Calculate time difference between two events (given date and time) in R

I'm currently struggling with a beginner's issue regarding the calculation of a time difference between two events.

I want to take a column consisting of date and time (both values in one column) into consideration and calculate a time difference with the value of the previous/next row with the same ID (A or B in this example).

ID = c("A", "A", "B", "B")
time = c("08.09.2014 10:34","12.09.2014 09:33","13.08.2014 15:52","11.09.2014 02:30")
d = data.frame(ID,time)

My desired output is in the format Hours:Minutes

time difference = c("94:59","94:59","682:38","682:38")

The format Days:Hours:Minutes or anything similar would also work, as long as it could be conveniently implemented. I am flexible regarding the format of the output, the above is just an idea that crossed my mind.

For each single ID, I always have two rows (in the example 2xA and 2xB). I don't have a convincing idea how to avoid the repition of the difference.

I've tried some examples before, which I found on stackoverflow. Most of them used POSIXt and strptime . However, I didn't manage to apply those ideas to my data set.

Here's my attempt using dplyr

library(dplyr)
d %>%
  mutate(time = as.POSIXct(time, format = "%d.%m.%Y %H:%M")) %>%
  group_by(ID) %>%
  mutate(diff = paste0(gsub("[.].*", "", diff(time)*24), ":",
                       round(as.numeric(gsub(".*[.]", ".", diff(time)*24))*60)))


# Source: local data frame [4 x 3]
# Groups: ID
# 
#   ID                time   diff
# 1  A 2014-09-08 10:34:00  94:59
# 2  A 2014-09-12 09:33:00  94:59
# 3  B 2014-08-13 15:52:00 682:38
# 4  B 2014-09-11 02:30:00 682:38

A very (to me) hack-ish base solution:

ID <- c("A", "A", "B", "B")
time <- c("08.09.2014 10:34", "12.09.2014 09:33", "13.08.2014 15:52","11.09.2014 02:30")
d <- data.frame(ID, time)

d$time <- as.POSIXct(d$time, format="%d.%m.%Y %H:%M")

unlist(unname(lapply(split(d, d$ID), function(d) {
  sapply(abs(diff(c(d$time[2], d$time))), function(x) {
    sprintf("%s:%s", round(((x*24)%/%1)), round(((x*24)%%1 *60)))
  }) 
})))

## [1] "94:59"  "94:59"  "682:38" "682:38"

I have to believe this function exists somewhere already, tho.

similar to the attempts of David and hrmbrmstr, I found that this solution using difftime works

I use a rowshift script I found on stackoverflow

rowShift <- function(x, shiftLen = 1L) {
   r <- (1L + shiftLen):(length(x) + shiftLen)
   r[r<1] <- NA
   return(x[r])
}
d$time.c <- as.POSIXct(d$time, format = "%d.%m.%Y %H:%M")
d$time.prev <- rowShift(d$time.c,-1)
d$diff <- difftime(d$time.c,d$time.prev, units="hours")

Every other row of d$diff has positive/negative values in the results. I do remove all the rows with negative values and have the difference between the first and the last time for every ID.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM