For a model I'm building, I want to create multiple lag terms for every field/vector in my data table.
For example, with the following data table:
library(data.table)
a <- c('x','x','x','y','y','y')
b <- runif(6, min=0, max=20)
c <- runif(6, min=50, max=1000)
df <- as.data.table(data.frame(a, b, c))
I can use the following code to create 2 lag terms for variable b within each group of a:
df[,c(paste("b","_L",1:2,sep="")):=lapply(1:2, function(i) c(rep(NA, i),head(b, -i))),by=a]
However, my problem comes when I try to apply this code to a large data table (100+ variables): I don't want to repeat 100+ lines of code (one line per variable).
I tried putting the code inside a loop over a list of variable names, but the names in the list do not seem to be recognized or passed into the code properly:
looplist <- colnames(df[,!1])
for (l in looplist) {
df[,c(paste(l,"_L",1:2,sep="")):=lapply(1:2, function(i) c(rep(NA, i),head(l, -i))),by=a]
}
Any advice on how to make this loop work across variables, or any other methods to accomplish the same objective (create multiple LAG terms for each and every variable in the data table) will be greatly appreciated!
Use shift from data.table and Map to handle the looping:
vars <- c("b","c")
rpv <- rep(1:2, each=length(vars))
df[, paste(vars, "lag", rpv, sep="_") := Map(shift, .SD, rpv), by=a, .SDcols=vars]
# a b c b_lag_1 c_lag_1 b_lag_2 c_lag_2
#1: x 10.863180 393.9568 NA NA NA NA
#2: x 6.139258 537.9199 10.863180 393.9568 NA NA
#3: x 11.896448 483.8036 6.139258 537.9199 10.86318 393.9568
#4: y 18.079188 509.6136 NA NA NA NA
#5: y 5.463224 233.6991 18.079188 509.6136 NA NA
#6: y 6.363724 869.8406 5.463224 233.6991 18.07919 509.6136
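As for the loop in the question: `l` holds a column name as a string, so `head(l, -i)` operates on that string rather than on the column it names. A minimal sketch of one way to make the loop work, assuming a data.table version that provides `shift()`, is to look the column up with `get(l)`. The seed and the `setdiff()` call that picks the columns are illustrative choices, not part of the original question:

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.table(a = rep(c("x", "y"), each = 3),
                 b = runif(6, min = 0, max = 20),
                 c = runif(6, min = 50, max = 1000))

# Every column except the grouping column; evaluated once, before the
# loop starts adding new columns.
looplist <- setdiff(names(df), "a")

for (l in looplist) {
  # get(l) fetches the column whose name is stored in l;
  # shift(x, 1:2) returns a list holding the lag-1 and lag-2 vectors.
  df[, paste0(l, "_L", 1:2) := shift(get(l), 1:2), by = a]
}

print(df)
```

With 100+ columns, the `Map`/`.SDcols` form above avoids the explicit loop, but the `get()` version stays closest to the original attempt.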
Here's a way to do it with dplyr:
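library(dplyr)
# funs() is deprecated since dplyr 0.8.0; a named list of lambdas gives the same column names
df %>%
group_by(a) %>%
mutate_all(list(lag1 = ~lag(., 1), lag2 = ~lag(., 2)))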
Output:
a b c b_lag1 c_lag1 b_lag2 c_lag2
<fctr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 x 6.663691 689.2483 NA NA NA NA
2 x 11.759130 397.8902 6.663691 689.2483 NA NA
3 x 3.888010 467.9758 11.759130 397.8902 6.663691 689.2483
4 y 6.221436 355.5437 NA NA NA NA
5 y 2.390940 701.2719 6.221436 355.5437 NA NA
6 y 17.141815 175.4642 2.390940 701.2719 6.221436 355.5437
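In dplyr 1.0.0 and later, the scoped verbs such as `mutate_all()` are superseded by `across()`. A rough equivalent, assuming dplyr >= 1.0.0, might look like the sketch below. The seed, the `lagged` name and the explicit `c(b, c)` selection are illustrative choices. Note that `across()` orders the new columns by variable first (b_lag1, b_lag2, c_lag1, c_lag2), which differs from the output above.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(a = rep(c("x", "y"), each = 3),
                 b = runif(6, min = 0, max = 20),
                 c = runif(6, min = 50, max = 1000))

lagged <- df %>%
  group_by(a) %>%
  # Apply both lag functions to b and c; new columns are named
  # "<column>_<function name>" by default.
  mutate(across(c(b, c),
                list(lag1 = ~ dplyr::lag(.x, 1),
                     lag2 = ~ dplyr::lag(.x, 2)))) %>%
  ungroup()

print(lagged)
```

For a wide table, `c(b, c)` could be replaced by a tidyselect helper such as `everything()`, which skips the grouping column inside `across()`.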