
R {dplyr}: `rename` or `mutate` data.frames in `rowwise` list-column with different column names on LHS

I'm playing around with list-columns of data.frames with {dplyr} 1.0.0, and I'm wondering whether it is possible to rename() and mutate() columns in each data.frame without leaving the pipe when the nested data.frame is grouped rowwise.

Why do I want to know / do this? As far as I understand the philosophy of {dplyr} 1.0.0, it recommends rowwise() instead of using {purrr}'s map family on columns. Below I first show what I did before {dplyr} 1.0.0 and then show a couple of examples (most of them not working) for {dplyr} 1.0.0.

While {rlang} supports glue strings on the left-hand side (LHS), which can be used when writing custom {dplyr} functions, dynamic names on the LHS of {dplyr} functions in a rowwise tibble do not seem to be supported yet (at least my examples below are not working).
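To illustrate what I mean by glue strings in a custom function, here is a minimal sketch. `add_named_col` is a made-up helper, not part of any package, and the example assumes {dplyr} >= 1.0.0 together with an {rlang} version that supports glue strings on the LHS of `:=`.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: adds a column whose name and value both come
# from the string passed as `name`.
add_named_col <- function(df, name) {
  # "{name}" is a glue string; the value of `name` becomes the column name
  mutate(df, "{name}" := name)
}

res <- add_named_col(head(iris, 3), "a")
names(res)
```

Inside a function like this the name is resolved in the function's own environment, which is why it works without any rowwise grouping.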

For rename I found a way using rename_with(), but I have no idea how to get it working with mutate.

I also do not understand most of the error messages I get. They more or less say that I'm not using a string on the LHS before := , but in rowwise mode my referenced column ( new ) is actually a character vector of length 1.

library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(purrr)

myiris <- iris %>% 
  nest_by(Species, .key = "mydat") %>% 
  ungroup %>% 
  mutate(new = letters[1:3])

# our data looks like this
# we want to use the strings in column `new` on the LHS of `rename` and `mutate`
myiris
#> # A tibble: 3 x 3
#>   Species                 mydat new  
#>   <fct>      <list<tbl_df[,4]>> <chr>
#> 1 setosa               [50 x 4] a    
#> 2 versicolor           [50 x 4] b    
#> 3 virginica            [50 x 4] c

# For reference: under dplyr < 1.0 I did the following:

# rename in pipe
# working
myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ rename_at(.x, "Sepal.Length", function(z) paste(.y)))) %>% 
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>       a Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   5.1         3.5          1.4         0.2
#> 2   4.9         3            1.4         0.2
#> 3   4.7         3.2          1.3         0.2
#> 4   4.6         3.1          1.5         0.2
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>       b Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   7           3.2          4.7         1.4
#> 2   6.4         3.2          4.5         1.5
#> 3   6.9         3.1          4.9         1.5
#> 4   5.5         2.3          4           1.3
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>       c Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   6.3         3.3          6           2.5
#> 2   5.8         2.7          5.1         1.9
#> 3   7.1         3            5.9         2.1
#> 4   6.3         2.9          5.6         1.8
#> # ... with 46 more rows

# mutate in pipe
# never worked, even under dplyr < 1.0.0
myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ mutate(.x, eval(.y) := .y))) %>% 
  pull(mydat)
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `map2(mydat, new, ~mutate(.x, `:=`(eval(.y), .y)))`.

# mutate with custom function
# working
mymutate <- function(df, y) {
  mutate(df, !! y := y)
}

myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ mymutate(.x, .y))) %>% 
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width a    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          5.1         3.5          1.4         0.2 a    
#> 2          4.9         3            1.4         0.2 a    
#> 3          4.7         3.2          1.3         0.2 a    
#> 4          4.6         3.1          1.5         0.2 a    
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width b    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          7           3.2          4.7         1.4 b    
#> 2          6.4         3.2          4.5         1.5 b    
#> 3          6.9         3.1          4.9         1.5 b    
#> 4          5.5         2.3          4           1.3 b    
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width c    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          6.3         3.3          6           2.5 c    
#> 2          5.8         2.7          5.1         1.9 c    
#> 3          7.1         3            5.9         2.1 c    
#> 4          6.3         2.9          5.6         1.8 c    
#> # ... with 46 more rows





# dplyr >= 1.0.0
# objective: `rename()` or `mutate()` in pipe on list-column of data.frames 
#            while using different column names on LHS coming from another
#            column (here `new`)

myiris_row <- myiris %>% rowwise

# rename --------
# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename({{new}} := "Sepal.Length"))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! new := "Sepal.Length")))  
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! sym(new) := "Sepal.Length")))  
#> Error: Only strings can be converted to symbols

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(all_of(new) := "Sepal.Length")))  
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% rename(`:=`(all_of(new), "Sepal.Length")))`.
#> i The error occured in row 1.

# working, but only with `rename_with()`
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename_with(~ new, "Sepal.Length")))  %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>       a Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   5.1         3.5          1.4         0.2
#> 2   4.9         3            1.4         0.2
#> 3   4.7         3.2          1.3         0.2
#> 4   4.6         3.1          1.5         0.2
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>       b Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   7           3.2          4.7         1.4
#> 2   6.4         3.2          4.5         1.5
#> 3   6.9         3.1          4.9         1.5
#> 4   5.5         2.3          4           1.3
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>       c Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   6.3         3.3          6           2.5
#> 2   5.8         2.7          5.1         1.9
#> 3   7.1         3            5.9         2.1
#> 4   6.3         2.9          5.6         1.8
#> # ... with 46 more rows


# mutate ------
# the values of the new column don't matter
# here we just use the same input as the name, to show that RHS evaluation is easier.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! new := new))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! sym(new) := new))) 
#> Error: Only strings can be converted to symbols

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(all_of(new) := new))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% mutate(`:=`(all_of(new), new)))`.
#> i The error occured in row 1.

# almost working (but what is going on with the column name in mydat[[1]]?)
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate("{{new}}" := new)))  %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `promise_fn(3L)`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>           
#> 1          5.1         3.5          1.4         0.2 a               
#> 2          4.9         3            1.4         0.2 a               
#> 3          4.7         3.2          1.3         0.2 a               
#> 4          4.6         3.1          1.5         0.2 a               
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `"b"`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          7           3.2          4.7         1.4 b    
#> 2          6.4         3.2          4.5         1.5 b    
#> 3          6.9         3.1          4.9         1.5 b    
#> 4          5.5         2.3          4           1.3 b    
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `"c"`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          6.3         3.3          6           2.5 c    
#> 2          5.8         2.7          5.1         1.9 c    
#> 3          7.1         3            5.9         2.1 c    
#> 4          6.3         2.9          5.6         1.8 c    
#> # ... with 46 more rows

Created on 2020-12-22 by the reprex package (v0.3.0)

You can protect your !! from the outer call by wrapping it in quote() , and then put another !! in front of the quote() so that the protected !!new is handed on to your nested call, which unquotes it:

myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! quote(!!new) := new))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width a    
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          5.1         3.5          1.4         0.2 a    
#>  2          4.9         3            1.4         0.2 a    
#>  3          4.7         3.2          1.3         0.2 a    
#>  4          4.6         3.1          1.5         0.2 a    
#>  5          5           3.6          1.4         0.2 a    
#>  6          5.4         3.9          1.7         0.4 a    
#>  7          4.6         3.4          1.4         0.3 a    
#>  8          5           3.4          1.5         0.2 a    
#>  9          4.4         2.9          1.4         0.2 a    
#> 10          4.9         3.1          1.5         0.1 a    
#> # ... with 40 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width b    
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          7           3.2          4.7         1.4 b    
#>  2          6.4         3.2          4.5         1.5 b    
#>  3          6.9         3.1          4.9         1.5 b    
#>  4          5.5         2.3          4           1.3 b    
#>  5          6.5         2.8          4.6         1.5 b    
#>  6          5.7         2.8          4.5         1.3 b    
#>  7          6.3         3.3          4.7         1.6 b    
#>  8          4.9         2.4          3.3         1   b    
#>  9          6.6         2.9          4.6         1.3 b    
#> 10          5.2         2.7          3.9         1.4 b    
#> # ... with 40 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width c    
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          6.3         3.3          6           2.5 c    
#>  2          5.8         2.7          5.1         1.9 c    
#>  3          7.1         3            5.9         2.1 c    
#>  4          6.3         2.9          5.6         1.8 c    
#>  5          6.5         3            5.8         2.2 c    
#>  6          7.6         3            6.6         2.1 c    
#>  7          4.9         2.5          4.5         1.7 c    
#>  8          7.3         2.9          6.3         1.8 c    
#>  9          6.7         2.5          5.8         1.8 c    
#> 10          7.2         3.6          6.1         2.5 c    
#> # ... with 40 more rows

# the same approach works for rename()
myiris_row %>%
  mutate(mydat = list(mydat %>% rename(!! quote(!!new) := "Sepal.Length"))) %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>        a Sepal.Width Petal.Length Petal.Width
#>    <dbl>       <dbl>        <dbl>       <dbl>
#>  1   5.1         3.5          1.4         0.2
#>  2   4.9         3            1.4         0.2
#>  3   4.7         3.2          1.3         0.2
#>  4   4.6         3.1          1.5         0.2
#>  5   5           3.6          1.4         0.2
#>  6   5.4         3.9          1.7         0.4
#>  7   4.6         3.4          1.4         0.3
#>  8   5           3.4          1.5         0.2
#>  9   4.4         2.9          1.4         0.2
#> 10   4.9         3.1          1.5         0.1
#> # ... with 40 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>        b Sepal.Width Petal.Length Petal.Width
#>    <dbl>       <dbl>        <dbl>       <dbl>
#>  1   7           3.2          4.7         1.4
#>  2   6.4         3.2          4.5         1.5
#>  3   6.9         3.1          4.9         1.5
#>  4   5.5         2.3          4           1.3
#>  5   6.5         2.8          4.6         1.5
#>  6   5.7         2.8          4.5         1.3
#>  7   6.3         3.3          4.7         1.6
#>  8   4.9         2.4          3.3         1  
#>  9   6.6         2.9          4.6         1.3
#> 10   5.2         2.7          3.9         1.4
#> # ... with 40 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>        c Sepal.Width Petal.Length Petal.Width
#>    <dbl>       <dbl>        <dbl>       <dbl>
#>  1   6.3         3.3          6           2.5
#>  2   5.8         2.7          5.1         1.9
#>  3   7.1         3            5.9         2.1
#>  4   6.3         2.9          5.6         1.8
#>  5   6.5         3            5.8         2.2
#>  6   7.6         3            6.6         2.1
#>  7   4.9         2.5          4.5         1.7
#>  8   7.3         2.9          6.3         1.8
#>  9   6.7         2.5          5.8         1.8
#> 10   7.2         3.6          6.1         2.5
#> # ... with 40 more rows
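
One way to see why this works is to look at what the outer call receives once its own `!!` has been processed. The sketch below uses `rlang::expr()` only as a stand-in for the argument capture that `mutate()` performs internally (that substitution is an assumption made for illustration). Nothing is evaluated against data, so `mydat` and `new` do not need to exist.

```r
library(rlang)

# expr() processes `!!` while capturing the expression, much like the
# outer mutate() does with its arguments. The outer `!!` evaluates
# quote(!!new), which returns the unevaluated call `!!new` and inlines it.
protected <- expr(mutate(mydat, !!quote(!!new) := new))

# Element 3 of the captured call is the `:=` call; its first argument
# is the left-hand side that the nested mutate() will see.
lhs <- protected[[3]][[2]]
print(lhs)
```

The left-hand side is still an unevaluated `!!new`, so it is the nested `mutate()` that unquotes it, at a point where `new` refers to the value of the current row.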
