Compute multiple lags for each variable in a df and store the results in a nested list

Question

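I have a df (data) that I want to pass as an argument to the function fun.lag_cols in order to compute several lags for each column in the df. The results must be stored in a nested list, but my function seems to be missing (at least) one step.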

data <- data.frame(x1 = rnorm(10,0,1)
               , x2 = rnorm(10,2,3)
               , x3 = rnorm(10,6,1))

fun.lag_cols <- function(x, lag_from = 0, lag_to = 2) {
  x <- as.data.frame(x)
  cols_x <- ncol(x)
  lst_lag <- list()
  
  for (i in 1:cols_x) {
    for(j in lag_from:lag_to) {
      lst_lag[[i]] <- dplyr::lag(x[,i],j)
    }
    
  }
  return(lst_lag)
}

output <- fun.lag_cols(data)

In this particular example, I expect output to be a list of 3 elements (x1, x2, x3), each of which is itself a list of 3 elements (one per lag: 0, 1, 2).

My code seems to store only lag 2 for each variable (in general, the largest lag), which is clearly not the intended result.

I am open to different approaches, as long as they produce the final output (a nested list).

Thanks

Answer 1

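We can change the assignment to lst_lag[[i]] so that, inside the nested loop, the existing element is concatenated with the new lag values instead of being overwritten. The function has two changes: 1) initialize the output list with a predefined length ( vector('list', ncol(x)) ); 2) inside the nested loop, append a new child element to the i-th list element by concatenating the list that already exists with a new list created by wrapping the lag in list(), and assign the result back to the same list element ( <- ).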
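The difference between overwriting and appending can be seen in isolation with a minimal base-R sketch. The names overwritten and appended are made up for illustration, and plain integers stand in for the lagged vectors:

```r
# Overwriting: every pass of the loop replaces element 1,
# so only the value from the last iteration survives.
overwritten <- list()
for (j in 0:2) {
  overwritten[[1]] <- j
}
length(overwritten[[1]])  # 1

# Appending: concatenate what is already stored with a new
# one-element list, then assign the result back to the same slot.
appended <- vector("list", 1)  # slot 1 starts out as NULL
for (j in 0:2) {
  appended[[1]] <- c(appended[[1]], list(j))
}
length(appended[[1]])     # 3
```

Since c(NULL, list(j)) is simply list(j), the first iteration needs no special case.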

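fun.lag_cols <- function(x, lag_from = 0, lag_to = 2) {
  x <- as.data.frame(x)
  cols_x <- ncol(x)
  # pre-allocate one (initially NULL) slot per column
  lst_lag <- vector('list', cols_x)
  
  # seq_len() also behaves correctly when x has zero columns
  for (i in seq_len(cols_x)) {
    for (j in lag_from:lag_to) {
      # append lag j of column i as a new child element instead of overwriting
      lst_lag[[i]] <- c(lst_lag[[i]], list(dplyr::lag(x[, i], j)))
    }
  }
  return(lst_lag)
}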

-testing

fun.lag_cols(data)
[[1]]
[[1]][[1]]
 [1] -1.40431393 -2.22551238  0.06090537  0.77941726  1.10733091  1.20657717  0.71614034 -0.17990135  0.22058894  0.33598415

[[1]][[2]]
 [1]          NA -1.40431393 -2.22551238  0.06090537  0.77941726  1.10733091  1.20657717  0.71614034 -0.17990135  0.22058894

[[1]][[3]]
 [1]          NA          NA -1.40431393 -2.22551238  0.06090537  0.77941726  1.10733091  1.20657717  0.71614034 -0.17990135


[[2]]
[[2]][[1]]
 [1]  1.1334651  1.2385579  1.8930347 -4.7379766  2.0169352  0.7210822 -1.0322536  4.5446643  1.4421923  1.1316508

[[2]][[2]]
 [1]         NA  1.1334651  1.2385579  1.8930347 -4.7379766  2.0169352  0.7210822 -1.0322536  4.5446643  1.4421923

[[2]][[3]]
 [1]         NA         NA  1.1334651  1.2385579  1.8930347 -4.7379766  2.0169352  0.7210822 -1.0322536  4.5446643


[[3]]
[[3]][[1]]
 [1] 4.324912 5.114774 4.517017 7.001338 5.218430 4.408571 7.233504 6.875883 5.848294 4.696724

[[3]][[2]]
 [1]       NA 4.324912 5.114774 4.517017 7.001338 5.218430 4.408571 7.233504 6.875883 5.848294

[[3]][[3]]
 [1]       NA       NA 4.324912 5.114774 4.517017 7.001338 5.218430 4.408571 7.233504 6.875883

There is already a function available for this, namely shift from data.table, which accepts a vectorized n

library(data.table)
shift(data, n = 0:2)
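As far as I can tell, calling shift on the whole data frame returns a flat list (one element per column and lag combination) rather than a list of sub-lists. A possible way to get the nested shape asked for in the question is to apply shift column by column. This is a sketch under that assumption; the example data and the names flat and nested are illustrative only:

```r
library(data.table)

# Small deterministic example (made-up values)
data <- data.frame(x1 = c(1, 2, 3, 4),
                   x2 = c(10, 20, 30, 40),
                   x3 = c(5, 6, 7, 8))

# Whole data frame at once: one flat list, columns x lags elements
flat <- shift(data, n = 0:2)
length(flat)

# Column by column: each column yields its own list of lags,
# so the result is nested (outer level: column, inner level: lag)
nested <- lapply(data, data.table::shift, n = 0:2)
str(nested, max.level = 2)
```

The outer list keeps the column names because lapply preserves the names of the data frame; the inner lists are unnamed and ordered by lag.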

Answer 2

Using lapply:

fun.lag_cols <- function(x, lag_from = 0, lag_to = 2) {
  val <- lag_from:lag_to
  lapply(x, function(v) 
    setNames(lapply(val, function(n) dplyr::lag(v, n)), paste0('lag_', val)))
}

fun.lag_cols(data)

#$x1
#$x1$lag_0
# [1] -1.5095832 -0.2638919  0.5986575  3.3043298  0.9471048 -1.2154015
# [7]  0.8921754 -1.6614204 -0.2036500  0.9570701

#$x1$lag_1
# [1]         NA -1.5095832 -0.2638919  0.5986575  3.3043298  0.9471048
# [7] -1.2154015  0.8921754 -1.6614204 -0.2036500

#$x1$lag_2
# [1]         NA         NA -1.5095832 -0.2638919  0.5986575  3.3043298
# [7]  0.9471048 -1.2154015  0.8921754 -1.6614204


#$x2
#$x2$lag_0
# [1] -4.8181366  4.1741754  4.6560021 -0.5167334  1.5284542  8.7717049
# [7] -0.2104695  2.4273092  1.4985899  2.7356401

#$x2$lag_1
# [1]         NA -4.8181366  4.1741754  4.6560021 -0.5167334  1.5284542
# [7]  8.7717049 -0.2104695  2.4273092  1.4985899

#$x2$lag_2
# [1]         NA         NA -4.8181366  4.1741754  4.6560021 -0.5167334
# [7]  1.5284542  8.7717049 -0.2104695  2.4273092

#$x3
#$x3$lag_0
# [1] 7.712619 5.237124 5.798063 5.695696 5.127347 3.789074 5.830557
# [8] 3.801073 5.794048 5.227110

#$x3$lag_1
# [1]       NA 7.712619 5.237124 5.798063 5.695696 5.127347 3.789074
# [8] 5.830557 3.801073 5.794048

#$x3$lag_2
# [1]       NA       NA 7.712619 5.237124 5.798063 5.695696 5.127347
# [8] 3.789074 5.830557 3.801073
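
Because both levels of the result are named, a single lag can be pulled out by name. The sketch below repeats the same lapply structure with a hypothetical base-R helper, lag_vec, standing in for dplyr::lag so that it runs without extra packages. The helper, the function name lag_cols, and the example data are assumptions for illustration, not part of the answer:

```r
# Hypothetical base-R stand-in for dplyr::lag(v, n) with n >= 0:
# pad n NAs at the front, then cut back to the original length.
lag_vec <- function(v, n) c(rep(NA, n), v)[seq_along(v)]

lag_cols <- function(x, lag_from = 0, lag_to = 2) {
  val <- lag_from:lag_to
  # outer lapply: one entry per column; inner lapply: one entry per lag
  lapply(x, function(v)
    setNames(lapply(val, function(n) lag_vec(v, n)), paste0("lag_", val)))
}

# Small deterministic example (made-up values)
toy <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(10, 20, 30, 40))
res <- lag_cols(toy)

names(res)     # "x1" "x2"
names(res$x1)  # "lag_0" "lag_1" "lag_2"
res$x2$lag_1   # NA 10 20 30
```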
