Apply function to grouped dataframe R
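I am trying to split a data frame by group and then apply a function that creates many separate lists or data frames.
For example, split the data below by id, then create a separate list or data frame for each 60-day period.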
id <- rep(1:5, each = 365)
date <- rep(seq(as.Date('2019-01-01'), as.Date('2019-12-31'), 'day'), 5)
x <- rnorm(1825)
y <- rnorm(1825)
df <- data.frame(id, date, x, y)
I tried combining group_split or split with seq.int, and I can get it to work for just one id:
library(dplyr)  # provides %>% and filter()
df2 <- df %>%
  filter(id == 1)
# One list element per rolling 60-day window: rows i to i + 59
data <- lapply(
  seq(1, length(df2$x) - (60 - 1)),  # number of days - (n - 1) window starts
  function(i) {
    list(x = df2$x[seq.int(i, i + (60 - 1))],
         y = df2$y[seq.int(i, i + (60 - 1))])
  })
So the final output would be a set of lists like the output above, or a nested data frame:
id | last_date  | data
1  | 2019-12-31 | [60 x 3]
1  | 2019-12-30 | [60 x 3]
1  | 2019-12-29 | [60 x 3]
You can try split and then a nested lapply:
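library(tibble)
# Split by id, then build one row per 60-day window within each id
nested <- do.call(rbind, lapply(split(df, df$id), function(x) {
  # 1:60 covers only the first 60 window start positions for each id
  # (use seq_len(nrow(x) - 59) instead to cover every window)
  do.call(rbind, lapply(1:60, function(y) {
    z <- x[y + 0:59, ]  # rows y to y + 59
    tibble(id = z$id[1], last_date = max(z$date),
           data = list(list(x = z$x, y = z$y)))
  }))
}))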
This gives a nested data frame in which each row has an id, a date, and a list of x and y values:
nested
#> # A tibble: 300 x 3
#> id last_date data
#> * <int> <date> <list>
#> 1 1 2019-03-01 <named list [2]>
#> 2 1 2019-03-02 <named list [2]>
#> 3 1 2019-03-03 <named list [2]>
#> 4 1 2019-03-04 <named list [2]>
#> 5 1 2019-03-05 <named list [2]>
#> 6 1 2019-03-06 <named list [2]>
#> 7 1 2019-03-07 <named list [2]>
#> 8 1 2019-03-08 <named list [2]>
#> 9 1 2019-03-09 <named list [2]>
#> 10 1 2019-03-10 <named list [2]>
#> # ... with 290 more rows
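The output above holds 60 windows per id, because the inner loop runs over 1:60. If every rolling 60-day window is wanted (as in the single-id attempt in the question), a base R variant might look like the sketch below. The names window_size, roll_windows and all_windows are hypothetical, the seed is added only for reproducibility, and the result is a plain nested list rather than a tibble.

```r
# Recreate the example data from the question (seed added for reproducibility)
set.seed(1)
id   <- rep(1:5, each = 365)
date <- rep(seq(as.Date("2019-01-01"), as.Date("2019-12-31"), "day"), 5)
df   <- data.frame(id, date, x = rnorm(1825), y = rnorm(1825))

window_size <- 60  # hypothetical name for the window length

# For one id's rows, return a list with one element per rolling window
roll_windows <- function(d, n = window_size) {
  # Window start positions: 1, 2, ..., nrow(d) - n + 1 (none if d is too short)
  starts <- seq_len(max(0, nrow(d) - n + 1))
  lapply(starts, function(i) {
    w <- d[i:(i + n - 1), ]
    list(id = w$id[1],
         last_date = max(w$date),
         data = w[, c("date", "x", "y")])  # 60 x 3 data frame
  })
}

# One list of windows per id, named "1" to "5" by split()
all_windows <- lapply(split(df, df$id), roll_windows)

length(all_windows)                # number of ids
length(all_windows[["1"]])         # number of windows for id 1
dim(all_windows[["1"]][[1]]$data)  # rows and columns of the first window
```

Each element of all_windows[["1"]] carries the id, the last date of the window, and the 60 x 3 data frame that the question asks for.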
So you can do:
nested$data[1]
#> [[1]]
#> [[1]]$`x`
#> [1] -0.186294037 0.407488434 0.521475261 -0.422258233 1.664796990 -0.316456771
#> [7] 0.182665242 -0.484338801 -0.192649909 0.873081270 -0.990823599 1.144433027
#> [13] 0.051712197 1.859142715 0.990093007 -0.001696676 -1.562290916 -0.476260992
#> [19] 0.975347849 0.084371694 -1.282503593 2.051669409 -0.703195871 0.350304665
#> [25] -0.324944027 1.640499226 1.197330101 -0.105973265 0.554276498 0.297189917
#> [31] 0.293502194 0.634043164 -0.322015474 -0.058275122 -0.410971343 0.309959510
#> [37] -1.379586045 -0.768224289 1.526995932 -0.981805376 -1.012230771 -0.364945605
#> [43] 1.130352216 1.131795697 -0.055765142 -1.517343421 0.282312602 -1.494675521
#> [49] -1.655192256 0.384317093 -0.518346889 0.828578708 1.071474898 1.365419431
#> [55] -0.348300271 -0.190303801 0.618120362 -0.969343280 -0.751382466 0.097511207
#>
#> [[1]]$y
#> [1] 0.241841347 -0.433619020 0.341544232 -0.443052330 1.418478139 -1.003223330
#> [7] -1.155966273 -0.862516431 -0.122768860 -0.082490747 0.456701203 0.405343576
#> [13] -2.106437081 0.645778514 -1.169678108 -1.451066029 -0.008333085 0.197081852
#> [19] -2.159107679 -2.167901928 0.062350030 0.507316009 0.077904318 -0.067838411
#> [25] -0.134667541 0.148749420 -0.463352528 0.293970945 -1.312431997 1.548834167
#> [31] -0.081291696 2.459888522 -0.105747872 -0.662130765 -1.127856102 0.037625236
#> [37] 0.378573145 0.574886376 -0.458236747 -1.402287567 0.703899240 1.532274574
#> [43] 0.654629245 0.762259424 0.001331954 0.619991076 -0.909183901 0.031380382
#> [49] -0.017907535 0.092751553 0.376905305 -1.104308942 -1.309079449 -0.252910625
#> [55] 0.991728742 0.217956094 -0.051243518 0.191618312 -0.633832253 -1.466263740
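Once each window is stored as a list with x and y, a function can be applied to every window in turn. The following is a minimal sketch for a single id using base R only. The summary function summarise_window and the statistics it computes are illustrative assumptions, not part of the original answer, and the seed is added for reproducibility.

```r
set.seed(42)  # seed is an addition for reproducibility
date   <- seq(as.Date("2019-01-01"), as.Date("2019-12-31"), "day")
one_id <- data.frame(id = 1L, date = date, x = rnorm(365), y = rnorm(365))

# Rebuild the first 60 windows for a single id, shaped like the list column above
windows <- lapply(1:60, function(i) {
  z <- one_id[i + 0:59, ]
  list(x = z$x, y = z$y)
})

# Hypothetical per-window summary: mean of x and correlation of x with y
summarise_window <- function(w) {
  c(mean_x = mean(w$x), cor_xy = cor(w$x, w$y))
}

# vapply checks that every window returns exactly two numbers;
# t() turns the 2 x 60 result into one row per window
stats <- t(vapply(windows, summarise_window, numeric(2)))
head(stats, 3)
```

With the nested tibble from the answer, the same function could be applied with lapply(nested$data, summarise_window), since each element of nested$data is a list with x and y.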