
data.table: quickly compute statistics of row times within a bidirectional moving time window

library(data.table)
library(lubridate)
df <- data.table(col1 = c('A', 'A', 'A', 'B', 'B', 'B'), col2 = c("2015-03-06 01:37:57", "2015-03-06 01:39:57", "2015-03-06 01:45:28", "2015-03-06 02:31:44", "2015-03-06 03:55:45", "2015-03-06 04:01:40"))

For each row, I want to compute the standard deviation of the times (col2) of all rows that have the same value of col1 and whose time falls within a window from 10 minutes before to 10 minutes after this row's time (both ends inclusive).

I tried a fast approach based on the solution to a previous question:

df$col2 <- as_datetime(df$col2)
gap <- 10L
df[, feat1 := .SD[.(col1 = col1, t1 = col2 - gap * 60L, t2 = col2 + gap * 60L)
                  , on = .(col1, col2 >= t1, col2 <= t2)
                  , .(col1, col2 = x.col2, times = as.numeric(col2))
                  ][, .(sd_times = sd(times))
                    , by = .(col1, col2)]$sd_times][]

But I get the following error:

Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__,  : 
  Join results in 14 rows; more than 12 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.
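
To see where the 14 rows come from, here is a minimal base R sketch that counts, for each row, how many rows of the same col1 fall inside its ±10 minute window. The names gap_secs and n_matches are illustrative, not part of the original code, and the times are parsed as UTC as an assumption.

```r
# Same sample data as above, kept as plain vectors (no packages needed).
col1 <- c("A", "A", "A", "B", "B", "B")
col2 <- as.POSIXct(c("2015-03-06 01:37:57", "2015-03-06 01:39:57",
                     "2015-03-06 01:45:28", "2015-03-06 02:31:44",
                     "2015-03-06 03:55:45", "2015-03-06 04:01:40"),
                   tz = "UTC")
secs <- as.numeric(col2)
gap_secs <- 10 * 60  # window half-width in seconds

# For each row, count rows in the same group within the inclusive window.
n_matches <- sapply(seq_along(secs), function(i) {
  sum(col1 == col1[i] & abs(secs - secs[i]) <= gap_secs)
})

n_matches       # 3 3 3 1 2 2
sum(n_matches)  # 14, which exceeds nrow(x) + nrow(i) = 12
```

Every row matches itself plus its neighbours in the window, so the join output grows beyond the 12-row limit that data.table applies by default.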

I solved this by following Frank's comment and adding allow.cartesian=TRUE to the join:

df[, feat1 := .SD[.(col1 = col1, t1 = col2 - gap * 60L, t2 = col2 + gap * 60L)
                  , on = .(col1, col2 >= t1, col2 <= t2)
                  , .(col1, col2 = x.col2, times = as.numeric(col2)), allow.cartesian=TRUE
                  ][, .(sd_times = sd(times))
                    , by = .(col1, col2)]$sd_times][]
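
The error message also mentions by = .EACHI as a way to avoid the large allocation. The following is a sketch of that alternative, not the approach used in the post: it evaluates sd() once per window row instead of materializing all matched pairs first. The helper names windows and res are hypothetical, and as.POSIXct with tz = "UTC" stands in for lubridate's as_datetime so that the example needs only data.table.

```r
library(data.table)

df <- data.table(
  col1 = c("A", "A", "A", "B", "B", "B"),
  col2 = as.POSIXct(c("2015-03-06 01:37:57", "2015-03-06 01:39:57",
                      "2015-03-06 01:45:28", "2015-03-06 02:31:44",
                      "2015-03-06 03:55:45", "2015-03-06 04:01:40"),
                    tz = "UTC")
)
gap <- 10L  # window half-width in minutes

# One window per row: [col2 - gap, col2 + gap], matched within the same col1.
windows <- df[, .(col1, t1 = col2 - gap * 60L, t2 = col2 + gap * 60L)]

# by = .EACHI runs j once per row of `windows`;
# x.col2 refers to the original times of the matched rows in df.
res <- df[windows,
          on = .(col1, col2 >= t1, col2 <= t2),
          .(sd_times = sd(as.numeric(x.col2))),
          by = .EACHI]

# res has one row per window, in the same order as df.
df[, feat1 := res$sd_times]
df[]
```

A row with no neighbours in its window (the fourth row here) gets NA, because sd() of a single value is undefined.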

