
Lists of Lists, Convert to Data Frame in R

I have something that looks like this:

Group1 <- list(Date=c("a","b","c"), Name=c("a2","b2"), Age=c("a3","b3","c3","d3"))
Group2 <- list(Date=c("a","b","c"), Name=c("a2","b2","b3"), Age=c("a3","b3","c3","d3"))
Group3 <- list(Date=c("a","b","c"), Name=c("a2","b2"), Age=c("a3","b3"))
all <- list(Group1,Group2,Group3)
all

What I need is to add NAs so that the Date, Name, and Age vectors within each group are of equal length. I then need to convert the result to a data frame.

I am stuck on how to add the NAs since I have lists within lists. I will have over 1,000 "Groups" with lists of data in them (always the same Date, Name, Age categories, so this length doesn't change). The longest list within these Groups should always be 4 for the current example, so anything shorter should be padded with NAs. I've seen code like this, which is close but does not work for lists within lists:

## Compute maximum length
max.length <- max(sapply(all, length))
## Add NA values to list elements
l <- lapply(all, function(v) { c(v, rep(NA, max.length-length(v)))})

Is there something similar I can do for my current data set?
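To make the problem concrete, here is a small sketch of why the snippet above falls short on nested data: length() applied to each group counts its three elements (Date, Name, Age) rather than the lengths of the vectors inside. The name groups is a hypothetical stand-in for all, chosen only to avoid masking base::all().

```r
# Same structure as in the question; 'groups' is a hypothetical name used
# instead of 'all' so that base::all() is not masked.
Group1 <- list(Date = c("a", "b", "c"), Name = c("a2", "b2"), Age = c("a3", "b3", "c3", "d3"))
Group2 <- list(Date = c("a", "b", "c"), Name = c("a2", "b2", "b3"), Age = c("a3", "b3", "c3", "d3"))
Group3 <- list(Date = c("a", "b", "c"), Name = c("a2", "b2"), Age = c("a3", "b3"))
groups <- list(Group1, Group2, Group3)

# length() of each group counts its elements (Date, Name, Age),
# so every group reports 3 regardless of the vectors inside
outer_lengths <- sapply(groups, length)

# lengths() looks one level deeper: after transposing, one row per group
# and one column per field
inner_lengths <- t(sapply(groups, lengths))

# Longest vector anywhere in the nested structure
max_length <- max(inner_lengths)
```

Here max_length is the value the padding step would need, whereas max(outer_lengths) only reports the number of fields per group.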

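We could try combining purrr and plyr (all_list is presumably the question's list all with its elements named 1 to 3, which would account for the .id column in the output):

plyr::ldply(purrr::map(all_list, unlist), function(x) rbind(x, NA))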

Output:

#    .id Date1 Date2 Date3 Name1 Name2 Age1 Age2 Age3 Age4 Name3
#1     1     a     b     c    a2    b2   a3   b3   c3   d3  <NA>
#2     1  <NA>  <NA>  <NA>  <NA>  <NA> <NA> <NA> <NA> <NA>  <NA>
#3     2     a     b     c    a2    b2   a3   b3   c3   d3    b3
#4     2  <NA>  <NA>  <NA>  <NA>  <NA> <NA> <NA> <NA> <NA>  <NA>
#5     3     a     b     c    a2    b2   a3   b3 <NA> <NA>  <NA>
#6     3  <NA>  <NA>  <NA>  <NA>  <NA> <NA> <NA> <NA> <NA>  <NA>

names(all) <- seq_along(all) # Names will help us later in bind_rows (.id)

Convert each list element into a valid data frame:

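all_mod <- lapply(all, function(x) {
  # Length of the longest vector within this group
  max.length <- max(sapply(x, length))
  # Pad each vector with NA up to max.length, then combine the columns
  data.frame(sapply(x, function(v) c(v, rep(NA, max.length - length(v)))),
             stringsAsFactors = FALSE)
})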
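Note that max.length is computed per group here, so a group whose longest vector has 3 elements (Group3 in the question) ends up with 3 rows rather than 4. If every group should be padded to one fixed length instead, a minimal sketch using the base `length<-` function could look like the following; pad_group is a hypothetical helper name, and the target length 4 is taken from the question.

```r
# Hypothetical helper: pad every vector in one group to length n with NA.
# Caution: `length<-` truncates if n is smaller than a vector's length.
pad_group <- function(group, n) {
  # `length<-` extends a vector with NA when n exceeds its current length
  padded <- lapply(group, `length<-`, n)
  as.data.frame(padded, stringsAsFactors = FALSE)
}

Group3 <- list(Date = c("a", "b", "c"), Name = c("a2", "b2"), Age = c("a3", "b3"))

# Pad to the fixed length of 4 mentioned in the question
padded3 <- pad_group(Group3, 4)
```

For the full data one could pass the overall maximum, for example max(unlist(lapply(all, lengths))), as n for every group.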

Finally, bind all elements together using bind_rows, with .id identifying the data frame each row came from:

library(dplyr)
bind_rows(all_mod, .id = 'ID')
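If dplyr is not available, roughly the same result can be sketched in base R by adding the ID column by hand and stacking with rbind. The two small data frames below are made-up stand-ins for the elements of all_mod, and with_id and combined are hypothetical names. Unlike bind_rows, rbind expects every data frame to have the same columns, which holds here because each group always has Date, Name and Age.

```r
# Two small, already padded data frames standing in for the elements of all_mod
all_mod <- list(
  `1` = data.frame(Date = c("a", "b"), Name = c("a2", NA), stringsAsFactors = FALSE),
  `2` = data.frame(Date = c("a", NA), Name = c("a2", "b2"), stringsAsFactors = FALSE)
)

# Add the ID column by hand; the single id value is recycled over all rows
with_id <- Map(function(df, id) data.frame(ID = id, df, stringsAsFactors = FALSE),
               all_mod, names(all_mod))

# Stack the pieces and drop the automatically generated row names
combined <- do.call(rbind, with_id)
rownames(combined) <- NULL
```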
