簡體   English   中英

unnest_auto 和 unnest_longer 取消嵌套多個列

[英]unnest_auto and unnest_longer to unnest multiple columns

我有一個試圖取消嵌套的嵌套數據框。 這是一個假的結構示例。

df <- structure(list(`_id` = c("a", "b", "c", "d"), 
                     variable = list(structure(list(type = c("u", "a", "u", "a", "u", "a", "a"), 
                                                    m_ = c("m1",
                                                           "m2",
                                                           "m3",
                                                           "m4",
                                                           "m5",
                                                           "m6", "m7"), #omitted from original example by mistake 
                                                    t_ = c("2015-07-21 4:13 PM", 
                                                           "2016-04-21 7:25 PM", 
                                                           "2017-10-04 9:49 PM", 
                                                           "2018-12-04 12:29 PM", 
                                                           "2019-04-20 20:20 AM", 
                                                           "2016-05-20 12:00 AM", 
                                                           "2016-06-20 12:00 AM"), 
                                                    a_ = c(NA, 
                                                           "", 
                                                           NA, 
                                                           "", 
                                                           NA, 
                                                           "C", 
                                                           "C")), 
                                               class = "data.frame", 
                                               row.names = c(NA, 7L)), 
                                     structure(list(type = c("u", "a"), 
                                                    m_ = c("m1",
                                                           "m2"), 
                                                    t_ = c("2018-05-24 12:08 AM", 
                                                           "2019-04-24 3:05 PM"), 
                                                    a_ = c(NA, "")), 
                                               class = "data.frame", 
                                               row.names = 1:2), 
                                     structure(list(type = "u", 
                                                    m_ = "m1", 
                                                    t_ = "2018-02-17 3:14 PM"), 
                                               class = "data.frame", 
                                               row.names = 1L), 
                                     structure(list(type = "u", 
                                                    m_ = "m1",
                                                    t_ = "2016-05-27 5:14 PM",
                                                    b_ = "b1", 
                                                    i_ = "i1", 
                                                    e_ = structure(list(), 
                                                                   .Names = character(0), 
                                                                   class = "data.frame", 
                                                                   row.names = c(NA, -1L)), 
                                                    l_ = "l1"), 
                                               class = "data.frame", 
                                               row.names = 1L)),
                     myDate = structure(c(1521503311.992, 
                                          1521514011.161, 
                                          1551699584.65, 
                                          1553632693.94), 
                                        class = c("POSIXct", "POSIXt"))), 
                row.names = c(1L, 2L, 3L, 4L), 
                class = "data.frame")
View(df)

在此處輸入圖像描述

variable是長度不同的數據幀列表(在本例中最大字段為 7,但可以隨時間擴展)。

我嘗試使用tidyr的開發版本來利用新的unnest_auto()函數。

# devtools::install_github("tidyverse/tidyr")
df2 <- unnest_auto(df, variable)
View(df2)

在此處輸入圖像描述

如果我在結果上使用 unnest_longer 並指定一列,如type我得到它來擴展。

df3 <- unnest_longer(df2, type)

在此處輸入圖像描述

我沒有看到處理多列的unnest_longer()的任何參數。 有更好的方法嗎?

在這里,由於您要取消嵌套二維結構(即您想同時更改行和列),您可以只使用 unnest:

library(tidyr)

df <- as_tibble(df)
df
#> # A tibble: 4 × 3
#>   `_id` variable     myDate             
#>   <chr> <list>       <dttm>             
#> 1 a     <df [7 × 4]> 2018-03-19 18:48:31
#> 2 b     <df [2 × 4]> 2018-03-19 21:46:51
#> 3 c     <df [1 × 3]> 2019-03-04 05:39:44
#> 4 d     <df [1 × 7]> 2019-03-26 15:38:13

df |> 
  unnest(variable) 
#> # A tibble: 11 × 10
#>    `_id` type  m_    t_        a_    b_    i_    e_    l_    myDate             
#>    <chr> <chr> <chr> <chr>     <chr> <chr> <chr> <df[> <chr> <dttm>             
#>  1 a     u     m1    2015-07-…  <NA> <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  2 a     a     m2    2016-04-… ""    <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  3 a     u     m3    2017-10-…  <NA> <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  4 a     a     m4    2018-12-… ""    <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  5 a     u     m5    2019-04-…  <NA> <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  6 a     a     m6    2016-05-… "C"   <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  7 a     a     m7    2016-06-… "C"   <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  8 b     u     m1    2018-05-…  <NA> <NA>  <NA>        <NA>  2018-03-19 21:46:51
#>  9 b     a     m2    2019-04-… ""    <NA>  <NA>        <NA>  2018-03-19 21:46:51
#> 10 c     u     m1    2018-02-…  <NA> <NA>  <NA>        <NA>  2019-03-04 05:39:44
#> 11 d     u     m1    2016-05-…  <NA> b1    i1          l1    2019-03-26 15:38:13

如果您確實想分兩步完成,您可以利用unnest_longer()現在采用 tidyselect 規范的事實:

df |> 
  unnest_wider(variable) |> 
  unnest_longer(type:a_)
#> # A tibble: 11 × 10
#>    `_id` type  m_    t_        a_    b_    i_    e_    l_    myDate             
#>    <chr> <chr> <chr> <chr>     <chr> <chr> <chr> <df[> <chr> <dttm>             
#>  1 a     u     m1    2015-07-…  <NA> <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  2 a     a     m2    2016-04-… ""    <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  3 a     u     m3    2017-10-…  <NA> <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  4 a     a     m4    2018-12-… ""    <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  5 a     u     m5    2019-04-…  <NA> <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  6 a     a     m6    2016-05-… "C"   <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  7 a     a     m7    2016-06-… "C"   <NA>  <NA>        <NA>  2018-03-19 18:48:31
#>  8 b     u     m1    2018-05-…  <NA> <NA>  <NA>        <NA>  2018-03-19 21:46:51
#>  9 b     a     m2    2019-04-… ""    <NA>  <NA>        <NA>  2018-03-19 21:46:51
#> 10 c     u     m1    2018-02-…  <NA> <NA>  <NA>        <NA>  2019-03-04 05:39:44
#> 11 d     u     m1    2016-05-…  <NA> b1    i1          l1    2019-03-26 15:38:13

這似乎有效:

df %>% unnest_auto(variable) %>% unnest()

#Warning message:
#`cols` is now required.
#Please use `cols = c(type, m_, t_, a_, e_)`

df %>% unnest_auto(variable) %>% unnest(cols = c(type, m_, t_, a_, e_, l_))

# A tibble: 11 x 10
   `_id` type  m_    t_     a_    b_    i_    e_    l_    myDate             
   <chr> <chr> <chr> <chr>  <chr> <chr> <chr> <???> <chr> <dttm>             
 1 a     u     m1    2015-… NA    NA    NA    NA    NA    2018-03-20 02:48:31
 2 a     a     m2    2016-… ""    NA    NA    NA    NA    2018-03-20 02:48:31
 3 a     u     m3    2017-… NA    NA    NA    NA    NA    2018-03-20 02:48:31
 4 a     a     m4    2018-… ""    NA    NA    NA    NA    2018-03-20 02:48:31
 5 a     u     m5    2019-… NA    NA    NA    NA    NA    2018-03-20 02:48:31
 6 a     a     m6    2016-… C     NA    NA    NA    NA    2018-03-20 02:48:31
 7 a     a     m7    2016-… C     NA    NA    NA    NA    2018-03-20 02:48:31
 8 b     u     m1    2018-… NA    NA    NA    NA    NA    2018-03-20 05:46:51
 9 b     a     m2    2019-… ""    NA    NA    NA    NA    2018-03-20 05:46:51
10 c     u     m1    2018-… NA    NA    NA    NA    NA    2019-03-04 14:39:44
11 d     u     m1    2016-… NA    b1    i1    NA    l1    2019-03-26 23:38:13

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM