
Is there an R function to compare timestamps in two different datasets?

I am new to R and have just started my research work, so please excuse me if the answer is obvious. I have tried to find the answer in other questions, including this similar but not identical one (R Stats: Comparing timestamps in two dataframes), but I am not sure I am using the right search terms.

For our research question we wanted to measure episodes of a heart arrhythmia (atrial fibrillation, Afib) in patients. We did this using two different methods: ECG and PPG.

Therefore we have two different dataframes per patient.

ECG:

start               | end                   | type
19.10.2020 11:34:53 | 19.10.2020 11:35:24   | noise   
19.10.2020 22:49:53 | 19.10.2020 22:59:53   | Afib
19.10.2020 23:00:21 | 19.10.2020 23:10:53   | Afib
19.10.2020 23:47:14 | 19.10.2020 23:56:22   | Afib

PPG:

start               | end                   | type
19.10.2020 11:25:53 | 19.10.2020 11:40:24   | noise   
19.10.2020 22:49:53 | 19.10.2020 22:59:53   | Afib
19.10.2020 23:00:21 | 19.10.2020 23:15:53   | Afib
19.10.2020 23:42:04 | 19.10.2020 23:54:38   | Afib
20.10.2020 00:02:14 | 20.10.2020 00:19:26   | Afib

Each row represents either one episode of Afib or one episode of noise (signal not good enough for detection). The measurement was continuous, but only arrhythmic events were documented.

We want to compare the second method to the first method to see if it would be a viable alternative to detect heart arrhythmia in patients. Hence we want to find:

  • true positives: episodes that were detected by both the gold standard (ECG) and PPG (row 2 in the example above)

  • false positives: episodes that were detected only by the PPG method (row 5 in the PPG example above)

and so forth...

So far I have converted the timestamps with lubridate, so that R treats them as date-times and not just text:

library(lubridate)
ppg$Start <- dmy_hms(ppg$Start, tz = Sys.timezone())
ppg$End <- dmy_hms(ppg$End, tz = Sys.timezone())

leading to:

2020-10-19 22:49:53 | 2020-10-19 22:59:53 | Afib
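For a reproducible example, the two tables above could be rebuilt as data frames and parsed the same way. This is only a sketch: the lowercase column names start, end and type are copied from the tables (the real data may use Start and End), and tz = "UTC" is an assumption made so that the result does not depend on the local machine.

```r
library(lubridate)

# Example rows copied from the tables above
ECG <- data.frame(
  start = c("19.10.2020 11:34:53", "19.10.2020 22:49:53",
            "19.10.2020 23:00:21", "19.10.2020 23:47:14"),
  end   = c("19.10.2020 11:35:24", "19.10.2020 22:59:53",
            "19.10.2020 23:10:53", "19.10.2020 23:56:22"),
  type  = c("noise", "Afib", "Afib", "Afib")
)
PPG <- data.frame(
  start = c("19.10.2020 11:25:53", "19.10.2020 22:49:53",
            "19.10.2020 23:00:21", "19.10.2020 23:42:04",
            "20.10.2020 00:02:14"),
  end   = c("19.10.2020 11:40:24", "19.10.2020 22:59:53",
            "19.10.2020 23:15:53", "19.10.2020 23:54:38",
            "20.10.2020 00:19:26"),
  type  = c("noise", "Afib", "Afib", "Afib", "Afib")
)

# Parse the day.month.year strings into POSIXct date-times.
# tz = "UTC" is an assumption; use the time zone of the recordings instead.
for (col in c("start", "end")) {
  ECG[[col]] <- dmy_hms(ECG[[col]], tz = "UTC")
  PPG[[col]] <- dmy_hms(PPG[[col]], tz = "UTC")
}

str(ECG)
```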

The condition for a true positive is that an ECG episode overlaps with a PPG episode for at least 30 seconds.
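To make the 30-second rule concrete: the overlap of two episodes is the earlier of the two end times minus the later of the two start times, and the episodes do not overlap at all when that difference is negative. A minimal base R sketch for one pair, using row 4 of each example table; the helper name overlap_secs and the "UTC" time zone are assumptions made for illustration.

```r
# Overlap in seconds between two episodes; 0 when they do not overlap.
# overlap_secs is a hypothetical helper name, not part of any package.
overlap_secs <- function(start1, end1, start2, end2) {
  latest_start <- max(start1, start2)
  earliest_end <- min(end1, end2)
  max(0, as.numeric(difftime(earliest_end, latest_start, units = "secs")))
}

fmt <- "%d.%m.%Y %H:%M:%S"
# tz = "UTC" is an assumption; use the time zone of the recordings
ecg_start <- as.POSIXct("19.10.2020 23:47:14", format = fmt, tz = "UTC")
ecg_end   <- as.POSIXct("19.10.2020 23:56:22", format = fmt, tz = "UTC")
ppg_start <- as.POSIXct("19.10.2020 23:42:04", format = fmt, tz = "UTC")
ppg_end   <- as.POSIXct("19.10.2020 23:54:38", format = fmt, tz = "UTC")

ov <- overlap_secs(ecg_start, ecg_end, ppg_start, ppg_end)
ov        # 444 seconds (23:47:14 to 23:54:38)
ov >= 30  # TRUE, so this pair would count as a true positive
```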

How would I go about implementing this to count true and false positives in R?

Thank you for your help.

The following function is probably more complicated than necessary, but I think it does what the question asks for. It needs the lubridate package.
Its input arguments are

  • X the ECG data.frame
  • Y the PPG data.frame
  • duration minimum overlap duration, in seconds
  • startcol name of the start date-times column
  • endcol name of the end date-times column
  • noisecol name of the column that holds the episode type; rows whose type is in noiseval are left out
  • noiseval a vector of type values not to be considered.

And the output is a list with members TP (matching row pairs with their overlap in seconds) and FP (row numbers of Y without a match).

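library(lubridate)

overlapDuration <- function(X, Y, duration = 30, startcol, endcol, noisecol, noiseval){
  # length in seconds of the overlap of two intervals, NA if they are disjoint
  overlap_length <- function(x, y){
    if(int_overlaps(x, y)){
      xstart <- int_start(x)
      xend <- int_end(x)
      ystart <- int_start(y)
      yend <- int_end(y)
      start <- max(xstart, ystart)
      end <- min(xend, yend)
      int <- interval(start, end)
      int_length(int)
    } else NA
  }
  # names of the objects passed in, used to label the output columns
  xname <- deparse(substitute(X))
  yname <- deparse(substitute(Y))
  Xi <- interval(X[[startcol]], X[[endcol]])
  Yi <- interval(Y[[startcol]], Y[[endcol]])
  # overlap matrix: rows are episodes of X, columns are episodes of Y
  overl <- sapply(Yi, \(x){
    sapply(Xi, overlap_length, x)
  })
  # a noise episode in either data set can never be part of a match
  i <- which(X[[noisecol]] %in% noiseval)
  j <- which(Y[[noisecol]] %in% noiseval)
  overl[i, ] <- NA
  overl[, j] <- NA
  # true positives: pairs that overlap for at least `duration` seconds
  w <- which(!is.na(overl) & overl >= duration, arr.ind = TRUE)
  colnames(w) <- c(xname, yname)
  TP <- cbind(w, secs = overl[w])
  # false positives: non-noise rows of Y that are not matched by any row of X
  FP <- which(!(seq_len(nrow(Y)) %in% w[, yname] | Y[[noisecol]] %in% noiseval))
  list(TP = TP, FP = FP)
}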

minduration <- 30
start <- "start"
end <- "end"
typecol <- "type"
noise <- "noise"
overlapDuration(ECG, PPG, minduration, start, end, typecol, noise)
#$TP
#     ECG PPG secs
#[1,]   2   2  600
#[2,]   3   3  632
#[3,]   4   4  444
#
#$FP
#[1] 5
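If lubridate intervals are not required, the same counts can also be reached with plain numeric seconds and outer(). The following is a base R sketch under stated assumptions, not a replacement for the function above: the names to_secs, ecg, ppg, overlap and matched are made up for this example, the data are the rows from the question, and the time zone is assumed to be "UTC".

```r
fmt <- "%d.%m.%Y %H:%M:%S"
# hypothetical helper: date-time string -> seconds since the epoch
to_secs <- function(x) as.numeric(as.POSIXct(x, format = fmt, tz = "UTC"))

ecg <- data.frame(
  start = to_secs(c("19.10.2020 11:34:53", "19.10.2020 22:49:53",
                    "19.10.2020 23:00:21", "19.10.2020 23:47:14")),
  end   = to_secs(c("19.10.2020 11:35:24", "19.10.2020 22:59:53",
                    "19.10.2020 23:10:53", "19.10.2020 23:56:22")),
  type  = c("noise", "Afib", "Afib", "Afib")
)
ppg <- data.frame(
  start = to_secs(c("19.10.2020 11:25:53", "19.10.2020 22:49:53",
                    "19.10.2020 23:00:21", "19.10.2020 23:42:04",
                    "20.10.2020 00:02:14")),
  end   = to_secs(c("19.10.2020 11:40:24", "19.10.2020 22:59:53",
                    "19.10.2020 23:15:53", "19.10.2020 23:54:38",
                    "20.10.2020 00:19:26")),
  type  = c("noise", "Afib", "Afib", "Afib", "Afib")
)

# overlap[i, j]: seconds shared by ECG episode i and PPG episode j
# (earliest end minus latest start, floored at 0 for disjoint episodes)
overlap <- pmax(outer(ecg$end, ppg$end, pmin) -
                outer(ecg$start, ppg$start, pmax), 0)

# noise episodes in either data set never count as a match
overlap[ecg$type == "noise", ] <- 0
overlap[, ppg$type == "noise"] <- 0

matched <- overlap >= 30                  # the 30-second rule
tp <- which(matched, arr.ind = TRUE)      # matching (ECG row, PPG row) pairs
fp <- which(ppg$type != "noise" & colSums(matched) == 0)

nrow(tp)  # 3 matching pairs: (2, 2), (3, 3), (4, 4)
fp        # 5, the PPG episode without an ECG counterpart
```

Working on seconds keeps the whole comparison vectorised, at the cost of losing the date-time classes inside the matrix.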
