简体   繁体   中英

flatten list column within dataframe in R

I have a data frame with two columns containing lists.

test <- data.frame(var_1 = c("ONE", "TWO"))

test$var_2 <- c(list(data.frame(Date = c("Date 1", "Date 2")), list(data.frame(Date = c("Date 3", "Date 4", "Date 5")))))                   
test$var_3 <- c(list(data.frame(Name = c("Name 1", "Name 2")), list(data.frame(Date = c("Name 3", "Name 4", "Name 5")))))                   

I'd like to loop over the lists to create a flattened data frame, repeating var_1 for the length of the list of each row

Desired Output

data.frame(var_1 = c("ONE", "ONE", "TWO", "TWO", "TWO"),
           Date = c("Date 1", "Date 2", "Date 3", "Date 4", "Date 5"),
           Name = c("Name 1", "Name 2", "Name 3", "Name 4", "Name 5"))


  var_1   Date   Name
1   ONE Date 1 Name 1
2   ONE Date 2 Name 2
3   TWO Date 3 Name 3
4   TWO Date 4 Name 4
5   TWO Date 5 Name 5

I tried applying some of the proposed solutions like using unnest from this post Flatten list column in data frame with ID column but this didn't quite give me my desired result.

Here is one option where we unlist the 'var_2', 'var_3', and unnest

library(dplyr)
library(purrr)
library(tidyr)
test %>% 
    group_split(var_1) %>%
    map_dfr(~ .x %>% 
                mutate_at(-1, ~ list(unlist(.))) %>% 
                unnest(c(var_2, var_3)))
# A tibble: 5 x 3
#  var_1 var_2  var_3 
#  <fct> <fct>  <fct> 
#1 ONE   Date 1 Name 1
#2 ONE   Date 2 Name 2
#3 TWO   Date 3 Name 3
#4 TWO   Date 4 Name 4
#5 TWO   Date 5 Name 5

Or we can do

test %>%
     rowwise %>%
     summarise_all(~ list(unlist(.))) %>%
     unnest(cols = everything())
# A tibble: 5 x 3
#  var_1 var_2  var_3 
#  <fct> <fct>  <fct> 
#1 ONE   Date 1 Name 1
#2 ONE   Date 2 Name 2
#3 TWO   Date 3 Name 3
#4 TWO   Date 4 Name 4
#5 TWO   Date 5 Name 5

Or with

test  %>% 
    group_by(var_1) %>%
    nest %>% 
    mutate(data = map(data, ~ summarise_all(.x, ~ list(unlist(.))) %>% 
    unnest(everything())))   %>% 
    unnest(data)

An option using data.table :

setDT(test)[, c(.(var_1=var_1), do.call(cbind.data.frame, list(var_2, var_3))), var_1][, 
    (1L) := NULL][]

output:

   var_1   Date   Name
1:   ONE Date 1 Name 1
2:   ONE Date 2 Name 2
3:   TWO Date 3 Name 3
4:   TWO Date 4 Name 4
5:   TWO Date 5 Name 5

data:

test <- data.frame(var_1 = c("ONE", "TWO"))

test$var_2 <- c(list(data.frame(Date = c("Date 1", "Date 2")), list(data.frame(Date = c("Date 3", "Date 4", "Date 5")))))                   
test$var_3 <- c(list(data.frame(Name = c("Name 1", "Name 2")), list(data.frame(Name = c("Name 3", "Name 4", "Name 5")))))                   

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM