简体   繁体   中英

How to loop lapply to create LAG terms over multiple variables in R

For a model I'm building, I want to create multiple lag terms for every field/vector in my data table:

For example, with the following data table:

    a<-c('x','x','x','y','y','y')  
    b<-runif(6, min=0, max=20)  
    c<-runif(6, min=50, max=1000)  
    df<-as.data.table(data.frame(a,b,c))    

I can use the following code to create 2 lag terms for variable b within each group a :

    df[,c(paste("b","_L",1:2,sep="")):=lapply(1:2, function(i) c(rep(NA, i),head(b, -i))),by=a]

However, my problem comes when I try to apply this code to a large data table (100+ variables), I would not want to repeat 100+ lines of code (1 line for each variable).

I tried to put the code inside of a loop with a list of variable names, but the variable names in the list cannot seem to be recognized or passed into the code properly:

    looplist <- colnames(df[,!1])  
    for (l in looplist) {
      df[,c(paste(l,"_L",1:2,sep="")):=lapply(1:2, function(i) c(rep(NA, i),head(l, -i))),by=a]
    } 

Any advice on how to make this loop work across variables, or any other methods to accomplish the same objective (create multiple LAG terms for each and every variable in the data table) will be greatly appreciated!

data.table and Map to handle the looping:

vars <- c("b","c")
rpv  <- rep(1:2, each=length(vars))
df[, paste(vars, "lag", rpv, sep="_") := Map(shift, .SD, rpv), by=a, .SDcols=vars]

#   a         b        c   b_lag_1  c_lag_1  b_lag_2  c_lag_2
#1: x 10.863180 393.9568        NA       NA       NA       NA
#2: x  6.139258 537.9199 10.863180 393.9568       NA       NA
#3: x 11.896448 483.8036  6.139258 537.9199 10.86318 393.9568
#4: y 18.079188 509.6136        NA       NA       NA       NA
#5: y  5.463224 233.6991 18.079188 509.6136       NA       NA
#6: y  6.363724 869.8406  5.463224 233.6991 18.07919 509.6136

Here's a way to do it with dplyr :

df %>%
    group_by(a) %>%
    mutate_all(funs(lag1 = lag(., 1), lag2 = lag(., 2)))

Output:

       a         b        c    b_lag1   c_lag1   b_lag2   c_lag2
  <fctr>     <dbl>    <dbl>     <dbl>    <dbl>    <dbl>    <dbl>
1      x  6.663691 689.2483        NA       NA       NA       NA
2      x 11.759130 397.8902  6.663691 689.2483       NA       NA
3      x  3.888010 467.9758 11.759130 397.8902 6.663691 689.2483
4      y  6.221436 355.5437        NA       NA       NA       NA
5      y  2.390940 701.2719  6.221436 355.5437       NA       NA
6      y 17.141815 175.4642  2.390940 701.2719 6.221436 355.5437

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM