
Subset data frame using lapply and subset

I have this data frame:

dataraw.1 <- structure(list(ABEV3 = c(15.2, 14.9, 15.22, 15.15, 15.18, 15.46, 
15.49, 15.5, 15.37, 15.49, 15.64, 15.38, 15.3, 15.01, 14.75, 
14.9, 14.77, 14.61, 14.21, 14.07, 14.1, 14.17, 14.55, 14.57, 
16.46), AEDU3 = c(9.01, 8.56, 8.66, 8.64, 8.44, 8.52, 8.29, 8.27, 
8.33, 8.26, 8.66, 8.49, 8.46, 8.4, 8.5, 8.46, 8.4, 8.39, 8.5, 
8.68, 8.53, 8.73, 8.31, 7.85, 10.99), ALLL3 = c(7.71, 7.81, 7.57, 
7.27, 7.29, 7.07, 7.11, 7.17, 7.27, 7.24, 7.1, 7.1, 7.1, 7.14, 
6.79, 6.65, 6.75, 6.93, 7.09, 7.11, 6.95, 6.75, 7, 6.8, 6.64), 
    BBAS3 = c(22.85, 22.78, 22.8, 22.22, 22.51, 21.11, 20.84, 
    20.79, 20.67, 20.9, 19.82, 18.95, 18.7, 18.84, 19.13, 19.25, 
    19.22, 19.38, 19.56, 19.92, 20.37, 20.37, 19.96, 19.19, 19.47
    )), class = "data.frame", row.names = c(NA, 25L))

I would like to slice this data frame into 10 smaller data frames, which will be my samples, and put them in a list.

I did this:

library(dplyr)

k_day_regressions = c(5,8,10,12,14,16,18,20,22,25)
dataraw.samples<-list()

for (i in seq_along(k_day_regressions)) {

  dataraw.samples[[i]]= slice(dataraw.1, 1:k_day_regressions[i])

}

dataraw.samples

So, I have 10 samples.

How can I do this using the lapply function together with the subset function? I have been trying this and it is not working.

Thanks

Here you go:

lapply(k_day_regressions, function(x) slice(dataraw.1, 1:x) )
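Since the question asks specifically about combining lapply with subset, here is a minimal sketch of that variant. It assumes a small stand-in data frame with made-up values in place of the real dataraw.1, and the helper name row_id is hypothetical. subset() filters by a logical condition, so the row position has to be compared with each k; head() is shown alongside because it expresses "first k rows" more directly.

```r
# Stand-in for the question's data frame: 25 rows of made-up values (assumption)
dataraw.1 <- data.frame(ABEV3 = 15 + (1:25) / 10, AEDU3 = 8 + (1:25) / 10)
k_day_regressions <- c(5, 8, 10, 12, 14, 16, 18, 20, 22, 25)

# Row positions, used as the filter variable for subset()
row_id <- seq_len(nrow(dataraw.1))

# lapply + subset: keep the rows whose position is at most k
samples_subset <- lapply(k_day_regressions,
                         function(k) subset(dataraw.1, row_id <= k))

# lapply + head: same rows, without building a logical condition
samples_head <- lapply(k_day_regressions,
                       function(k) head(dataraw.1, k))

# Each list element should have k rows
sapply(samples_subset, nrow)
```

One caveat with subset(): names in the condition are looked up in the data frame first, so a column called k or row_id would silently mask the variables used here.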

You can do this with base R and no loops. Create a splitting variable f and then split the data frame.

d <- diff(c(0, k_day_regressions))
f <- rep.int(seq_along(d), times = d)
dataraw.samples <- split(dataraw.1, f)

dataraw.samples[[1]]
#  ABEV3 AEDU3 ALLL3 BBAS3
#1 15.20  9.01  7.71 22.85
#2 14.90  8.56  7.81 22.78
#3 15.22  8.66  7.57 22.80
#4 15.15  8.64  7.27 22.22
#5 15.18  8.44  7.29 22.51
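Note that split() produces consecutive, non-overlapping chunks (rows 1-5, then 6-8, then 9-10, and so on), so only the first element matches the samples built in the question. If the cumulative "first k rows" samples are wanted, one possible way is to stack the chunks with Reduce. The sketch below assumes a stand-in data frame with made-up values, and the names chunks and cumulative are hypothetical.

```r
# Stand-in for the question's data frame: 25 rows of made-up values (assumption)
dataraw.1 <- data.frame(ABEV3 = 15 + (1:25) / 10, AEDU3 = 8 + (1:25) / 10)
k_day_regressions <- c(5, 8, 10, 12, 14, 16, 18, 20, 22, 25)

# Chunk sizes are the differences between consecutive cut points
d <- diff(c(0, k_day_regressions))
f <- rep.int(seq_along(d), times = d)
chunks <- split(dataraw.1, f)

# Chunk sizes equal d, not k_day_regressions
sapply(chunks, nrow)

# Stack the chunks one after another, keeping every intermediate result,
# so that element i holds the first k_day_regressions[i] rows
cumulative <- Reduce(rbind, chunks, accumulate = TRUE)
sapply(cumulative, nrow)
```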

Some alternative approaches.

One with dplyr:

library(dplyr)

data.frame(k_day_regressions) %>%
  rowwise() %>%
  mutate(data = list(dataraw.1[1:k_day_regressions, ])) %>%
  ungroup()

# # A tibble: 10 x 2
#     k_day_regressions data                 
#                 <dbl> <list>               
#   1                 5 <data.frame [5 x 4]> 
#   2                 8 <data.frame [8 x 4]> 
#   3                10 <data.frame [10 x 4]>
#   4                12 <data.frame [12 x 4]>
#   5                14 <data.frame [14 x 4]>
#   6                16 <data.frame [16 x 4]>
#   7                18 <data.frame [18 x 4]>
#   8                20 <data.frame [20 x 4]>
#   9                22 <data.frame [22 x 4]>
#   10               25 <data.frame [25 x 4]>

You can save that as df2 and access each sub-dataframe as df2$data[[1]], etc.

One with purrr:

library(purrr)

map(k_day_regressions, ~ dataraw.1[1:.x, ])

which returns a list with 10 elements (sub-dataframes).
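For readers who prefer not to load purrr, the same list can be built in base R. The sketch below also names each element after its sample size, which can make later lookups easier to read. The stand-in data frame and the "k" name prefix are assumptions for illustration only.

```r
# Stand-in for the question's data frame: 25 rows of made-up values (assumption)
dataraw.1 <- data.frame(ABEV3 = 15 + (1:25) / 10, AEDU3 = 8 + (1:25) / 10)
k_day_regressions <- c(5, 8, 10, 12, 14, 16, 18, 20, 22, 25)

# Name the sizes first; lapply carries the names over to the result
named_k <- setNames(k_day_regressions, paste0("k", k_day_regressions))
samples <- lapply(named_k, function(k) dataraw.1[seq_len(k), ])

# Look up a sample by its size instead of by its position in the list
nrow(samples$k10)
```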

