
An efficient way to create a list of lists in R

I want to create a list of lists from a data frame. I can do it with a for loop:

n <- 5
df <- data.frame(x = rnorm(n), y = rnorm(n), N = sample(10:50,n))

expList <- vector("list", n)
for (i in 1:n)
{
  expList[[i]]$par$x <- df$x[i]
  expList[[i]]$par$y <- df$y[i]
  expList[[i]]$N <- df$N[i]
  class(expList[[i]]) <- c(class(expList[[i]]), "Experiment")
}

The result should look like this:

expList

[[1]]
$par
$par$x
[1] 2.574112

$par$y
[1] -2.33903


$N
[1] 36

attr(,"class")
[1] "list"       "Experiment"

[[2]]
$par
$par$x
[1] -0.264593

$par$y
[1] 0.5924768
    .........

I am looking for an efficient way of creating this list (suppose n = 10e7). Something like this: expList[1:n]$par$x <- df$x (I know this is wrong).

You can use Map. Combined with a constructor function for your class and do.call, this is very concise and appears to be a few times faster than the loop in the question.

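# Constructor for a single "Experiment" object
experiment <- function(x, y, N)
  structure(list(par = list(x = x, y = y), N = N), class = "Experiment")

# Apply the constructor to the columns of df in parallel, one row at a time
L <- do.call(Map, c(f = experiment, df))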
str(L)
List of 5
 $ :List of 2
  ..$ par:List of 2
  .. ..$ x: num -0.754
  .. ..$ y: num -0.768
  ..$ N  : int 27
  ..- attr(*, "class")= chr "Experiment"
 $ :List of 2
  ..$ par:List of 2
  .. ..$ x: num 0.487
  .. ..$ y: num -1.31
  ..$ N  : int 23
  ..- attr(*, "class")= chr "Experiment"
 $ :List of 2
  ..$ par:List of 2
  .. ..$ x: num -0.653
  .. ..$ y: num -0.2
  ..$ N  : int 35
  ..- attr(*, "class")= chr "Experiment"
 $ :List of 2
  ..$ par:List of 2
  .. ..$ x: num -0.687
  .. ..$ y: num -0.441
  ..$ N  : int 17
  ..- attr(*, "class")= chr "Experiment"
 $ :List of 2
  ..$ par:List of 2
  .. ..$ x: num -0.0851
  .. ..$ y: num -0.665
  ..$ N  : int 24
  ..- attr(*, "class")= chr "Experiment"
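To make the do.call step less opaque, here is a minimal sketch of what it expands to. The small data frame is a hypothetical stand-in with the same column layout as in the question, and the names viaDoCall and viaMap are only for illustration. Note that this constructor sets the class to "Experiment" alone, whereas the loop in the question produces c("list", "Experiment").

```r
# Hypothetical data with the same columns as in the question.
df <- data.frame(x = c(0.1, 0.2, 0.3),
                 y = c(1.1, 1.2, 1.3),
                 N = c(10L, 20L, 30L))

# Constructor from the answer above.
experiment <- function(x, y, N)
  structure(list(par = list(x = x, y = y), N = N), class = "Experiment")

# c(f = experiment, df) is a list holding the function plus the three columns;
# do.call() passes its elements to Map() as named arguments.
viaDoCall <- do.call(Map, c(f = experiment, df))

# The same call written out by hand.
viaMap <- Map(experiment, x = df$x, y = df$y, N = df$N)

# Compare the two results.
identical(viaDoCall, viaMap)
```

Because the columns are matched to the constructor's arguments by name, the data frame's column names have to agree with the argument names of experiment.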

Data

df<-structure(list(x = c(-0.754391843396212, 0.487237170179346, -0.653098590457105, 
-0.686632907020112, -0.0850559453983232), y = c(-0.767944417138587, 
-1.31042221234913, -0.199621075494168, -0.441313470125542, -0.664834248101919
), N = c(27L, 23L, 35L, 17L, 24L)), .Names = c("x", "y", "N"), row.names = c(NA, 
-5L), class = "data.frame")
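If you want to check the speed claim on your own machine, a rough timing sketch could look like the following. The wrapper names loopBuild and mapBuild, the seed, and n = 1e4 are assumptions made for illustration, and sample() is called with replace = TRUE so that n can exceed the 41 values in 10:50. Timings will vary, so no numbers are shown here.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 1e4     # kept small so the loop finishes quickly
df <- data.frame(x = rnorm(n), y = rnorm(n),
                 N = sample(10:50, n, replace = TRUE))

# Constructor from the answer above.
experiment <- function(x, y, N)
  structure(list(par = list(x = x, y = y), N = N), class = "Experiment")

# The loop from the question, wrapped in a function.
loopBuild <- function(df) {
  n <- nrow(df)
  expList <- vector("list", n)
  for (i in seq_len(n)) {
    expList[[i]]$par$x <- df$x[i]
    expList[[i]]$par$y <- df$y[i]
    expList[[i]]$N <- df$N[i]
    class(expList[[i]]) <- c(class(expList[[i]]), "Experiment")
  }
  expList
}

# The Map-based approach from the answer.
mapBuild <- function(df) do.call(Map, c(f = experiment, df))

# Time both builds on the same data.
system.time(a <- loopBuild(df))
system.time(b <- mapBuild(df))
```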

Would you be OK with a list holding each row of the data frame? I.e.:

for (i in 1:n) {
  expList[[i]] <- df[i, ]
}

You can still access those variables within each list element. Or does your real data have a different format from the example data?

# %>% needs library(dplyr); without it, just do lapply(split(df, 1:n), as.list)
split(df, 1:n) %>% lapply(as.list)

> str(.Last.value)
List of 5
 $ 1:List of 3
  ..$ x: num 0.979
  ..$ y: num -0.358
  ..$ N: int 23
 $ 2:List of 3
  ..$ x: num -0.0297
  ..$ y: num 0.589
  ..$ N: int 21
 ... and so on

Operate on each element with lapply(expList, function(x) {blah}). I feel that having 10e7 lists in R might not be ideal.
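As a rough sketch of that pattern, the following builds the per-row lists with base R only and then applies a function to each element. The data frame, the names rowLists and ratios, and the x / N computation are hypothetical placeholders for whatever you actually need to do.

```r
# Hypothetical data with the same columns as in the question.
df <- data.frame(x = c(0.1, 0.2, 0.3),
                 y = c(1.1, 1.2, 1.3),
                 N = c(10L, 20L, 30L))

# One list per row: split() yields one-row data frames, as.list() flattens each.
rowLists <- lapply(split(df, seq_len(nrow(df))), as.list)

# Placeholder per-element operation: divide x by N for every row.
ratios <- lapply(rowLists, function(e) e$x / e$N)

str(rowLists[[1]])
```

The elements here are flat lists with x, y and N at the top level, not the nested par structure from the question.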

Try this solution:

apply(df, 1L, function(x) {
    result <- as.list(x)
    class(result) <- c(class(result), "Experiment")
    result
})
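One thing to be aware of with this approach: apply() converts the data frame to a matrix first, so all columns end up with one common type, and the resulting lists are flat rather than nested under par. A small sketch on hypothetical data (the name viaApply is only for illustration) lets you inspect this:

```r
# Hypothetical data with the same columns as in the question.
df <- data.frame(x = c(0.1, 0.2), y = c(1.1, 1.2), N = c(10L, 20L))

viaApply <- apply(df, 1L, function(row) {
  result <- as.list(row)
  class(result) <- c(class(result), "Experiment")
  result
})

typeof(df$N)             # type of N in the data frame
typeof(viaApply[[1]]$N)  # type of N after going through apply()
names(viaApply[[1]])     # top-level element names of one result
```

If the integer type of N or the nested par element matters, the Map-based constructor keeps both.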
