简体   繁体   中英

Flattening a dataframe with nested lists containing NULL values in R

I would like to condense this dataframe which is an output from reading a JSON file to remove the NULL values. Currently when unnesting the dataframe I end up with unnecessary NA values. I tried compact in purr and I tried some is.null operations to no avail.

The dataframe output after reading the json file looks something like this:

    col1        col2       col3
1   c(a,b,c)    NULL       NULL
2   NULL        c(d,e,f)   NULL
3   NULL        NULL       c(g,h,i)

And I want it to look like this:

    col1        col2       col3
1   c(a,b,c)    c(d,e,f)   c(g,h,i)

So that when I unnest the dataframe I get:

    col1   col2   col3
1   a      d      g
2   b      e      h
3   c      f      i

I know there must be a correct function for this, but I still haven't found it, Apologies for any formatting errors on this post, still new to this!

If I understand your question correctly, your data looks like this:

tibble(col1 = list(c("a", "b", "c"), NULL, NULL), 
       col2 = list(NULL, c("d", "e", "f"), NULL),
       col3 = list(NULL, NULL, c("g", "h", "i"))) 

# A tibble: 3 x 3
  col1      col2      col3     
  <list>    <list>    <list>   
1 <chr [3]> <NULL>    <NULL>   
2 <NULL>    <chr [3]> <NULL>   
3 <NULL>    <NULL>    <chr [3]>

That is, something like a dataframe or tibble with list columns, where some of the cells contain some (character) vector and other are NULL and you want to reduce this to a single row with only the vector and leave the nulls out.

My solution is to use dplyr::summarise and purrr::compact to reduce the rows to a single row and to get rid of those pesky NULL s respectively:

tibble(col1 = list(c("a", "b", "c"), NULL, NULL), 
       col2 = list(NULL, c("d", "e", "f"), NULL),
       col3 = list(NULL, NULL, c("g", "h", "i"))) %>% 
    summarise_all(compact) 

# A tibble: 1 x 3
  col1      col2      col3     
  <list>    <list>    <list>   
1 <chr [3]> <chr [3]> <chr [3]>

Unnesting this will give you

# A tibble: 3 x 3
  col1  col2  col3 
  <chr> <chr> <chr>
1 a     d     g    
2 b     e     h    
3 c     f     i    

Note: This only works as long as you have a single non-null in each of your column.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM