
R: find the max (or min) over a dynamic window of a time series

I have a time series with two columns. One column contains a "signal" that is either NA or an integer between 1 and 5 (I don't care about the actual value, only whether it is NA). The second column contains the actual value.

I need to calculate the highest value since the signal was last non-NA, as shown in the example below.

    Date        Sig Val     Expected result
    2008-01-01  1   47      47  <<- Sig==1, i. e. here we start counting
    2008-01-02  NA  31      47
    2008-01-03  NA  61      61  <<- 61 is higher than 47, so this one is important now
    2008-01-04  NA  43      61  
    2008-01-05  NA  23      61
    2008-01-06  NA  46      61
    2008-01-07  NA  17      61
    2008-01-08  NA  52      61
    2008-01-09  NA  84      84  <<- a new high, value should be kept
    2008-01-10  NA  54      84
    2008-01-11  1   30      30  <<- a new signal, here we start counting again
    2008-01-12  NA  36      36  <<- a new higher value in this segment
    2008-01-13  NA  59      59  <<- again a new higher value in this segment
    2008-01-14  NA  56      59
    2008-01-15  NA  15      59
    2008-01-16  NA  21      59
    2008-01-17  NA  87      87
    2008-01-18  NA  81      87
    2008-01-19  2   94      94  <<- a new signal this time a 2, so here we start counting again
    2008-01-20  NA  42      94
    2008-01-21  NA  95      95
    2008-01-22  1   42      42  <<- a new signal, here we start counting again
    2008-01-23  NA  25      42
    2008-01-24  NA  20      42
    2008-01-25  NA  76      76
    2008-01-26  NA  95      95
    2008-01-27  NA  14      95
    2008-01-28  NA  12      95
    2008-01-29  NA  13      95
    2008-01-30  NA  57      95
    2008-01-31  NA  26      95

The problem is that the relevant observation window is dynamic: it depends on the signal column.

I've played around with runMax(), cummax() and rollapply(), but they only work on a fixed window length. I guess I can't see the wood for the trees, but I can't figure out how to make the "look-back" window dynamic. Any ideas?

Create a column that identifies the signal groups, then apply cummax within each group.

Using data.table (assuming your data is in a data.frame df with the columns Sig and Val from the question):

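    library(data.table)

    D <- as.data.table(df)
    # every non-NA Sig bumps the cumulative sum, which starts a new group;
    # cummax then runs separately within each group
    D[, list(maxvalue = cummax(Val)),
      by = list(sig2 = cumsum(replace(Sig, is.na(Sig), 0)))]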
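
For comparison, here is a minimal base R sketch of the same grouping idea, run on the Sig and Val columns from the question. The names df, grp, maxvalue and minvalue are assumptions chosen for illustration, and the sketch assumes the series starts on a signal row (leading NA rows would simply land in their own group 0). It uses cumsum(!is.na(...)) for the group id rather than summing the signal values, which is a swapped-in variant and not the code from the answer above.

```r
# Sig is NA except on the signal days shown in the question's table
Sig <- rep(NA_integer_, 31)
Sig[c(1, 11, 22)] <- 1L
Sig[19] <- 2L

df <- data.frame(
  Sig = Sig,
  Val = c(47, 31, 61, 43, 23, 46, 17, 52, 84, 54,
          30, 36, 59, 56, 15, 21, 87, 81,
          94, 42, 95,
          42, 25, 20, 76, 95, 14, 12, 13, 57, 26)
)

# Group id: increases by one at every non-NA signal
grp <- cumsum(!is.na(df$Sig))

# Running max (or min) that restarts at the beginning of each group
df$maxvalue <- ave(df$Val, grp, FUN = cummax)
df$minvalue <- ave(df$Val, grp, FUN = cummin)
```

Swapping cummax for cummin covers the "or min" case from the title; nothing else in the grouping changes.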

The same can be done using dplyr.

Say you have a tibble with your signal and a price:

    library(dplyr)

    # simulated price path; sig is TRUE on roughly 5% of rows
    df <- tibble(
      px  = c(1, cumprod(exp(rnorm(999, mean = 0.0001, sd = 0.01)))),
      sig = runif(length(px)) > 0.95
    )

Grouping by period and applying cummax works (a grouping column g is created):

    mutate(group_by(df, g = cumsum(sig)), rmax = cummax(px))

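Alternatively, accumulate2 from purrr works without grouping:

    # ..1 = running max so far, ..2 = current px, ..3 = current sig; a signal resets to px
    mutate(df, rmax = unlist(purrr::accumulate2(px, tail(sig, -1), ~ ..3 * ..2 + (1 - ..3) * max(..1, ..2))))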

This is a bit more flexible: the function can be adapted for more elaborate rolling calculations.
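
As a rough illustration of that flexibility, the accumulate-with-reset pattern can be sketched in base R with Reduce(..., accumulate = TRUE) instead of purrr. The helper name run_since and its arguments are hypothetical, introduced only for this sketch; f can be any two-argument combining function such as max or min.

```r
# Hypothetical helper: running statistic that restarts whenever reset[i] is TRUE.
# x     : numeric vector of values
# reset : logical vector of the same length, TRUE where a new segment begins
# f     : function combining the running result with the next value
run_since <- function(x, reset, f = max) {
  step <- function(acc, i) if (reset[i]) x[i] else f(acc, x[i])
  # start from x[1], then walk over positions 2..n, keeping every intermediate result
  Reduce(step, seq_along(x)[-1], x[1], accumulate = TRUE)
}

val   <- c(47, 31, 61, 30, 36)
reset <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

run_max <- run_since(val, reset, max)
run_min <- run_since(val, reset, min)
```

With the question's layout, the reset vector would be !is.na(Sig). Unlike the grouped cummax, the step function can carry any state-dependent rule, at the cost of an explicit loop over positions.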
