How to “spread” a list-column?
Consider this simple example:
mydf <- data_frame(regular_col = c(1,2),
                   normal_col = c('a','b'),
                   weird_col = list(list('hakuna', 'matata'),
                                    list('squash', 'banana')))
> mydf
# A tibble: 2 x 3
  regular_col normal_col weird_col
        <dbl> <chr>      <list>
1           1 a          <list [2]>
2           2 b          <list [2]>
I would like to extract the elements of weird_col (programmatically, since the number of elements may change) so that each element is placed in a different column. That is, I expect the following output:
> data_frame(regular_col = c(1,2),
+            normal_col = c('a','b'),
+            weirdo_one = c('hakuna', 'squash'),
+            weirdo_two = c('matata', 'banana'))
# A tibble: 2 x 4
  regular_col normal_col weirdo_one weirdo_two
        <dbl> <chr>      <chr>      <chr>
1           1 a          hakuna     matata
2           2 b          squash     banana
However, I am unable to do so in simple terms. For instance, the classic unnest fails here, because it adds rows to the data frame (one per list element) instead of placing each element of the list in a different column.
> mydf %>% unnest(weird_col)
# A tibble: 4 x 3
  regular_col normal_col weird_col
        <dbl> <chr>      <list>
1           1 a          <chr [1]>
2           1 a          <chr [1]>
3           2 b          <chr [1]>
4           2 b          <chr [1]>
Is there any solution in the tidyverse for that?
You can extract the values from the output of unnest, process them a little to make your column names, and then spread them back out. Note that I use flatten_chr because your list-column is only one level deep; if it is nested more deeply you can use flatten instead, and spread works just as well on list-cols.
library(tidyverse)
#> Warning: package 'dplyr' was built under R version 3.5.1
mydf <- data_frame(
  regular_col = c(1, 2),
  normal_col = c("a", "b"),
  weird_col = list(
    list("hakuna", "matata"),
    list("squash", "banana")
  )
)
mydf %>%
  unnest(weird_col) %>%
  group_by(regular_col, normal_col) %>%
  mutate(
    weird_col = flatten_chr(weird_col), # or just as.character
    weird_colname = str_c("weirdo_", row_number())
  ) %>%
  spread(weird_colname, weird_col)
#> # A tibble: 2 x 4
#> # Groups:   regular_col, normal_col [2]
#>   regular_col normal_col weirdo_1 weirdo_2
#>         <dbl> <chr>      <chr>    <chr>
#> 1           1 a          hakuna   matata
#> 2           2 b          squash   banana
Created on 2018-08-12 by the reprex package (v0.2.0).
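For readers on a newer tidyr, here is a minimal sketch of the same long-then-wide idea using pivot_wider in place of spread. It is not part of the answer above and assumes tidyr >= 1.0.0 (where pivot_wider was introduced); it also swaps flatten_chr for base unlist and ungroups before widening, which are my own choices rather than the answerer's.

```r
library(dplyr)
library(tidyr)
library(tibble)

mydf <- tibble(
  regular_col = c(1, 2),
  normal_col = c("a", "b"),
  weird_col = list(list("hakuna", "matata"), list("squash", "banana"))
)

wide <- mydf %>%
  # turn each inner list into a plain character vector
  mutate(weird_col = lapply(weird_col, unlist)) %>%
  # one row per element of the list-column
  unnest(weird_col) %>%
  # number the elements within each original row to build column names
  group_by(regular_col, normal_col) %>%
  mutate(weird_colname = paste0("weirdo_", row_number())) %>%
  ungroup() %>%
  # spread the numbered elements back out into columns
  pivot_wider(names_from = weird_colname, values_from = weird_col)
```

Because the column names come from row_number(), this should adapt to however many elements each list holds, with NA filling rows that have fewer elements.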
unnest expands lists and vectors vertically (into rows), and one-row data frames horizontally (into columns). So what we can do is turn each of your lists into a one-row data frame (with suitable column names) and unnest afterwards.
mydf %>%
  mutate(weird_col = map(weird_col, ~ as_data_frame(
    setNames(., paste0("weirdo_", seq_along(.)))
  ))) %>%
  unnest(weird_col)
# # A tibble: 2 x 4
#   regular_col normal_col weirdo_1 weirdo_2
#         <dbl> <chr>      <chr>    <chr>
# 1           1 a          hakuna   matata
# 2           2 b          squash   banana