How to “spread” a list-column?

Consider this simple example:

mydf <- data_frame(regular_col = c(1,2),
                   normal_col = c('a','b'),
                   weird_col = list(list('hakuna', 'matata'),
                                 list('squash', 'banana')))

> mydf
# A tibble: 2 x 3
  regular_col normal_col weird_col 
        <dbl> <chr>      <list>    
1           1 a          <list [2]>
2           2 b          <list [2]>

I would like to extract the elements of weird_col programmatically (the number of elements may change) so that each element is placed in its own column. That is, I expect the following output:

> data_frame(regular_col = c(1,2),
+           normal_col = c('a','b'),
+           weirdo_one = c('hakuna', 'squash'),
+           weirdo_two = c('matata', 'banana'))
# A tibble: 2 x 4
  regular_col normal_col weirdo_one weirdo_two
        <dbl> <chr>      <chr>      <chr>     
1           1 a          hakuna     matata
2           2 b          squash     banana    

However, I cannot find a simple way to do this. For instance, the classic unnest fails here, because it adds rows to the data frame instead of placing each element of the list in a separate column:

> mydf %>% unnest(weird_col)
# A tibble: 4 x 3
  regular_col normal_col weird_col
        <dbl> <chr>      <list>   
1           1 a          <chr [1]>
2           1 a          <chr [1]>
3           2 b          <chr [1]>
4           2 b          <chr [1]>

Is there a solution for this in the tidyverse?

You can extract the values from the output of unnest, do a little processing to build the column names, and then spread them back out. Note that I use flatten_chr because your list-column is only one level deep; if it is nested more deeply, you can use flatten instead, since spread works just as well on list-columns.

library(tidyverse)
mydf <- data_frame(
  regular_col = c(1, 2),
  normal_col = c("a", "b"),
  weird_col = list(
    list("hakuna", "matata"),
    list("squash", "banana")
  )
)
mydf %>%
  unnest(weird_col) %>%
  group_by(regular_col, normal_col) %>%
  mutate(
    weird_col = flatten_chr(weird_col), # or just as.character
    weird_colname = str_c("weirdo_", row_number())
  ) %>%
  spread(weird_colname, weird_col)
#> # A tibble: 2 x 4
#> # Groups:   regular_col, normal_col [2]
#>   regular_col normal_col weirdo_1 weirdo_2
#>         <dbl> <chr>      <chr>    <chr>   
#> 1           1 a          hakuna   matata  
#> 2           2 b          squash   banana

Created on 2018-08-12 by the reprex package (v0.2.0).
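
For readers on a newer tidyr (1.0.0 or later, where spread is superseded), the same approach can be sketched with pivot_wider. This is an adaptation rather than part of the original answer: it assumes tidyr >= 1.0.0, uses tibble() in place of data_frame(), and swaps flatten_chr for base unlist(); the result name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

mydf <- tibble(
  regular_col = c(1, 2),
  normal_col = c("a", "b"),
  weird_col = list(
    list("hakuna", "matata"),
    list("squash", "banana")
  )
)

wide <- mydf %>%
  unnest(weird_col) %>%                    # one row per list element
  mutate(weird_col = unlist(weird_col)) %>% # list column -> character vector
  group_by(regular_col, normal_col) %>%
  # number the elements within each original row to build column names
  mutate(weird_colname = paste0("weirdo_", row_number())) %>%
  ungroup() %>%
  pivot_wider(names_from = weird_colname, values_from = weird_col)

wide
```

Ungrouping before pivot_wider keeps the result a plain tibble, so the "Groups:" header seen in the output above should not appear.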

unnest expands lists and vectors vertically (into rows) and one-row data frames horizontally (into columns). So what we can do is convert each list into a one-row data frame with suitable column names, and unnest afterwards.

mydf %>%
  # turn each inner list into a one-row tibble named weirdo_1, weirdo_2, ...
  mutate(weird_col = map(weird_col, ~ as_tibble(
    setNames(., paste0("weirdo_", seq_along(.)))
  ))) %>%
  unnest(weird_col)

# # A tibble: 2 x 4
#   regular_col normal_col weirdo_1 weirdo_2
#         <dbl>      <chr>    <chr>    <chr>
# 1           1          a   hakuna   matata
# 2           2          b   squash   banana
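
Since tidyr 1.0.0 there is also unnest_wider, which spreads each element of a list-column into its own column directly. The sketch below is not from the answers above and assumes tidyr >= 1.0.0. Because the inner lists are unnamed, names_sep is supplied so the new columns are numbered; they come out as weird_col_1 and weird_col_2 rather than the weirdo_* names asked for, so rename them afterwards if that matters.

```r
library(tibble)
library(tidyr)

mydf <- tibble(
  regular_col = c(1, 2),
  normal_col = c("a", "b"),
  weird_col = list(
    list("hakuna", "matata"),
    list("squash", "banana")
  )
)

# names_sep is needed because the inner list elements have no names;
# the new columns are built as <column name><names_sep><position>
wide <- unnest_wider(mydf, weird_col, names_sep = "_")

wide
```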
