Convert R data frame into list of vectors

Question

I have a data frame (imported from an Excel worksheet in which I wrote lists of strings row by row) and want to convert its rows into a list of vectors, where each vector contains the non-missing cell values for that row.

For example:

#Sample data frame
dfX <- data.frame(C0 = c(1, 2, 3),
                  C1 = c("Apple", "Apple", "Pear"),
                  C2 = c("Banana", "Orange", "Lemon"),
                  C3 = c("Pear", "Melon", ""))

Which would be used to generate the following list:

myList = list(c("Apple", "Banana", "Pear"),
              c("Apple", "Orange", "Melon"),
              c("Pear", "Lemon"))

Note the third vector is truncated to two elements as the cell contains an empty string. Also note that the index (C0) is dropped.

I have seen examples that convert the data frame to a matrix, split it by row, and then assign the results into the global environment, e.g.:

list2env(setNames(split(as.matrix(dfX),
                    row(dfX)), paste0("Row",1:3)),
                    envir=.GlobalEnv)

But I was wondering whether there is (a) a newer tidyverse function for handling this and (b) a way to populate a list directly (I later want to lapply a function over that list). Ideally, the missing values would also be dropped on the way into the list.

Answer 1

Since you are interested in a tidyverse approach, one option would be:

library(tidyverse)

dfX %>%
  group_split(C0) %>% #Or use split(.$C0) if `dplyr` is not updated
  map(~discard(flatten_chr(.), . == "")[-1])

#[[1]]
#[1] "Apple"  "Banana" "Pear"  

#[[2]]
#[1] "Apple"  "Orange" "Melon" 

#[[3]]
#[1] "Pear"  "Lemon"

group_split is available as of dplyr 0.8.0. This also assumes that C0 is unique in every row. For each row, we discard any value equal to the empty string ("") and then drop the first element (the C0 index) with [-1].
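
If C0 cannot be guaranteed to be unique, a row-wise sketch with purrr::pmap avoids grouping altogether. This is only an illustration under stated assumptions: the columns are character (stringsAsFactors = FALSE is set explicitly here), and treating NA as missing is an addition that the sample data does not exercise.

```r
library(purrr)

dfX <- data.frame(C0 = c(1, 2, 3),
                  C1 = c("Apple", "Apple", "Pear"),
                  C2 = c("Banana", "Orange", "Lemon"),
                  C3 = c("Pear", "Melon", ""),
                  stringsAsFactors = FALSE)

# pmap() calls the function once per row, passing that row's cells as arguments.
# dfX[-1] drops the index column C0 before iterating.
myList <- pmap(dfX[-1], function(...) {
  x <- unname(c(...))       # collect the row's cells into one character vector
  x[!is.na(x) & x != ""]    # drop NA and empty strings (NA handling is an assumption)
})
```

The result is an ordinary list, so it can be passed straight to lapply or map.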

In base R, a combination of split and lapply would also work:

lapply(split(dfX[-1], dfX$C0), function(x) x[x != ""])

#$`1`
#[1] "Apple"  "Banana" "Pear"  

#$`2`
#[1] "Apple"  "Orange" "Melon" 

#$`3`
#[1] "Pear"  "Lemon"

Another base R option is apply with MARGIN = 1:

apply(dfX[-1], 1, function(x) x[x!= ""])
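
One caveat with apply: it returns a list only when the rows yield vectors of different lengths. If every row keeps the same number of values, apply simplifies the result to a matrix. Below is a minimal sketch of a variant that always returns a list. row_values is a hypothetical helper name, and treating NA as missing is an assumption based on the Excel import mentioned in the question.

```r
dfX <- data.frame(C0 = c(1, 2, 3),
                  C1 = c("Apple", "Apple", "Pear"),
                  C2 = c("Banana", "Orange", "Lemon"),
                  C3 = c("Pear", "Melon", ""),
                  stringsAsFactors = FALSE)

# Hypothetical helper: one character vector per row, without NA or "" cells.
row_values <- function(df) {
  m <- as.matrix(df)                    # character matrix, one row per record
  lapply(seq_len(nrow(m)), function(i) {
    x <- unname(m[i, ])                 # cells of row i, column names dropped
    x[!is.na(x) & x != ""]              # keep only non-missing, non-empty cells
  })
}

myList <- row_values(dfX[-1])           # dfX[-1] drops the index column C0

# For comparison: the first two rows both keep three values,
# so apply() simplifies the result to a matrix instead of a list.
same_len <- apply(dfX[1:2, -1], 1, function(x) x[x != ""])
```

Because lapply never simplifies, the helper returns a list regardless of how many values each row keeps.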

Answer 2

A base R option is by:

by(dfX, dfX$C0, function(x) unlist(x[x != ''][-1]))
#dfX$C0: 1
#[1] "Apple"  "Banana" "Pear"
#------------------------------------------------------------
#dfX$C0: 2
#[1] "Apple"  "Orange" "Melon"
#------------------------------------------------------------
#dfX$C0: 3
#[1] "Pear"  "Lemon"

by returns a "dressed" list; ignoring the attributes, this is the same as your expected myList.
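
If a plain list is needed, for example before passing it to lapply or comparing it with myList, one possible way to strip the "by" class and the other attributes is sketched below. It assumes character columns (stringsAsFactors = FALSE is set explicitly), and byList is a name introduced here purely for illustration.

```r
dfX <- data.frame(C0 = c(1, 2, 3),
                  C1 = c("Apple", "Apple", "Pear"),
                  C2 = c("Banana", "Orange", "Lemon"),
                  C3 = c("Pear", "Melon", ""),
                  stringsAsFactors = FALSE)

# Same call as above: one group per C0 value, empty cells and C0 itself dropped.
res <- by(dfX, dfX$C0, function(x) unlist(x[x != ''][-1]))

# Remove the class, dim, dimnames and stored call so that only the list remains.
byList <- res
attributes(byList) <- NULL
```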

2 answers

solution1
2 ACCEPTED 2019-03-19 02:39:03

solution2
1 2019-03-19 02:47:07
