
Apply a rolling function by group in R (zoo, data.table)

I am having trouble doing something fairly simple: apply a rolling function (standard deviation) by group in a data.table. My problem is that when I use a data.table with rollapply by some column, data.table recycles the observations as noted in the warning message below. I would like to get NAs for the observations that are outside of the window instead of recycling the standard deviations.

This is my approach so far using iris, and a rolling window of size 2, aligned to the right:

library(zoo)
library(data.table)

A <- iris
setDT(A)
A[,stdev := rollapply(Petal.Width, width = 2, sd, align = 'right', partial = F),by = Species]
Warning messages:
1: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 1 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
2: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 2 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
3: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 3 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).

> A
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species     stdeev      stdev
  1:          5.1         3.5          1.4         0.2    setosa 0.00000000 0.00000000
  2:          4.9         3.0          1.4         0.2    setosa 0.00000000 0.00000000
  3:          4.7         3.2          1.3         0.2    setosa 0.00000000 0.00000000
  4:          4.6         3.1          1.5         0.2    setosa 0.00000000 0.00000000
  5:          5.0         3.6          1.4         0.2    setosa 0.14142136 0.14142136
 ---                                                                                  
146:          6.7         3.0          5.2         2.3 virginica 0.28284271 0.28284271
147:          6.3         2.5          5.0         1.9 virginica 0.07071068 0.07071068
148:          6.5         3.0          5.2         2.0 virginica 0.21213203 0.21213203
149:          6.2         3.4          5.4         2.3 virginica 0.35355339 0.35355339
150:          5.9         3.0          5.1         1.8 virginica 0.42426407 0.42426407

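Add fill=NA to rollapply. With width=2, each group of 50 rows has only 49 complete windows, so rollapply returns 49 values and data.table recycles them to fill the 50th row. With fill=NA, a vector of length 50 is returned instead, with NA as the first value (since align="right"), so nothing is recycled.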

A[,stdev := rollapply(Petal.Width, width=2, sd, align='right', partial=F, fill=NA), by=Species]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      stdev
1            5.1         3.5          1.4         0.2     setosa         NA
2            4.9         3.0          1.4         0.2     setosa 0.00000000
3            4.7         3.2          1.3         0.2     setosa 0.00000000
...
51           7.0         3.2          4.7         1.4 versicolor         NA
52           6.4         3.2          4.5         1.5 versicolor 0.07071068
53           6.9         3.1          4.9         1.5 versicolor 0.00000000
...
101          6.3         3.3          6.0         2.5  virginica         NA
102          5.8         2.7          5.1         1.9  virginica 0.42426407
103          7.1         3.0          5.9         2.1  virginica 0.14142136
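
The length difference that triggers the recycling warning can be checked on a plain vector before applying the fix by group. The following is a minimal sketch, not a definitive implementation: the vector x holds made-up values, and the column name stdev_fr is a hypothetical name chosen for this illustration. The last step uses data.table's own frollapply as an alternative to zoo, assuming an installed data.table version that provides it; it pads incomplete windows with NA by default and aligns to the right.

```r
library(zoo)
library(data.table)

# Made-up values, for illustration only
x <- c(0.2, 0.2, 0.4, 0.1, 0.3)

# Without fill, only complete windows are returned:
# length(x) - width + 1 values
short <- rollapply(x, width = 2, FUN = sd, align = "right")
length(short)   # 4

# With fill = NA the result is padded back to length(x);
# for align = "right" the padding goes at the start
padded <- rollapply(x, width = 2, FUN = sd, align = "right", fill = NA)
length(padded)  # 5

# The same call by group: each group now gets exactly one leading NA
A <- as.data.table(iris)
A[, stdev := rollapply(Petal.Width, width = 2, FUN = sd,
                       align = "right", fill = NA),
  by = Species]
A[, sum(is.na(stdev)), by = Species]

# Alternative (assumption: frollapply is available in your data.table);
# stdev_fr is a hypothetical column name
A[, stdev_fr := frollapply(Petal.Width, 2, sd), by = Species]
all.equal(A$stdev, A$stdev_fr)
```

If the two columns agree, either approach can be used; the zoo version keeps the original call almost unchanged, while frollapply avoids the extra package dependency.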
