简体   繁体   中英

Computing rolling mean in data.table with adaptive window lengths

I am looking to compute a moving average by group in a data.table with an adaptive window so that there are no NAs at the beginning of the time series. I know how to do this with frollmean and setting adaptive = TRUE (see for instance jangorecki's response in this thread). I can get the same code to work when all groups in my data.table are of the same length but run into errors when the groups are of different size.

For example, if my data is

tmp = data.table(Gp = c(rep('A',6),rep('B',4)), Val = c(1,3,4,6,2,2,8,5,7,10))

and I am doing a moving average of length 3, then the desired response is

> desired_output
    Gp  Val
 1:  A 1.00
 2:  A 2.00
 3:  A 2.67
 4:  A 4.33
 5:  A 4.00
 6:  A 3.33
 7:  B 8.00
 8:  B 6.50
 9:  B 6.67
10:  B 7.33

I tried the following:

mov_window_len = vector("list",2)
mov_window_len[[1]] = c(1,2,rep(3,4))
mov_window_len[[2]] = c(1,2,rep(3,2))
tmp[,lapply(.SD, frollmean, n = mov_window_len, align = "right", adaptive = TRUE), by = Gp]

but I get an error saying length of integer vector(s) provided as list to 'n' argument must be equal to number of observations provided in 'x'

Any help in resolving this will be much appreciated. Thanks in advance.

You can use the group index .GRP to subset mov_window_len . This will give you the right lengths for each group. You only want to take frollmean of Val , so no need for lapply .

tmp[, frollmean(Val, n = mov_window_len[.GRP], align = "right", adaptive = TRUE), by = Gp]

#     Gp       V1
#  1:  A 1.000000
#  2:  A 2.000000
#  3:  A 2.666667
#  4:  A 4.333333
#  5:  A 4.000000
#  6:  A 3.333333
#  7:  B 8.000000
#  8:  B 6.500000
#  9:  B 6.666667
# 10:  B 7.333333

Alternatively window length can be added to input data.table ( Len field below), as it corresponds to each row.

tmp[Gp=="A", Len:=mov_window_len[[1]]
    ][Gp=="B", Len:=mov_window_len[[2]]
     ][, .(Val, Len, RollVal=frollmean(Val, Len, adaptive=TRUE)), by=Gp]
#    Gp Val Len  RollVal
# 1:  A   1   1 1.000000
# 2:  A   3   2 2.000000
# 3:  A   4   3 2.666667
# 4:  A   6   3 4.333333
# 5:  A   2   3 4.000000
# 6:  A   2   3 3.333333
# 7:  B   8   1 8.000000
# 8:  B   5   2 6.500000
# 9:  B   7   3 6.666667
#10:  B  10   3 7.333333

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM