
dplyr mutate_at and rename together

I often need to recode several columns that follow the same structure and save the results in columns with different names. If I could overwrite the originals, this would be a single line in dplyr, but since I also want to keep the original columns, I don't know a good solution. An illustration follows.

This is the long version, whose output I would like to replicate:

library(dplyr)
library(ggplot2)
data("diamonds")

diamonds <- diamonds %>% 
  mutate(x_char = case_when(x <= 4.5 ~ "low",
                            x >  4.5 & x < 7 ~ "so-so",
                            x >= 7 ~ "large",
                            TRUE ~ as.character(NA)),
         y_char = case_when(y <= 4.5 ~ "low",
                            y >  4.5 & y < 7 ~ "so-so",
                            y >= 7 ~ "large",
                            TRUE ~ as.character(NA)),
         z_char = case_when(z <= 4.5 ~ "low",
                            z >  4.5 & z < 7 ~ "so-so",
                            z >= 7 ~ "large",
                            TRUE ~ as.character(NA)))

This would be the short code with mutate_at that overwrites the original columns:

library(dplyr)
library(ggplot2)
data("diamonds")

diamonds <- diamonds %>%
  mutate_at(vars(x, y, z), ~ case_when(. <= 4.5 ~ "low",
                                       . >  4.5 & . < 7 ~ "so-so",
                                       . >= 7 ~ "large",
                                       TRUE ~ as.character(NA)))

Is there a way to keep the short code with mutate_at but change it in a way that the original columns are kept, and the new ones are saved with a different name? In the example that would mean adding _char at the end of the original column name and changing the recode according to the embedded formula.

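Try using across() with its .names argument, which keeps the original columns and writes the recoded values to new ones: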

library(tidyverse)

diamonds %>% 
  mutate(
    across(.cols = c(x, y, z),
           .fns = ~case_when(.x <= 4.5 ~ "low",
                             .x >  4.5 & .x < 7 ~ "so-so",
                             .x >= 7 ~ "large",
                             TRUE ~ as.character(NA)),
           .names = "{.col}_char")
  )
#> # A tibble: 53,940 x 13
#>    carat cut     color clarity depth table price     x     y     z x_char y_char
#>    <dbl> <ord>   <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr>  <chr> 
#>  1 0.23  Ideal   E     SI2      61.5    55   326  3.95  3.98  2.43 low    low   
#>  2 0.21  Premium E     SI1      59.8    61   326  3.89  3.84  2.31 low    low   
#>  3 0.23  Good    E     VS1      56.9    65   327  4.05  4.07  2.31 low    low   
#>  4 0.290 Premium I     VS2      62.4    58   334  4.2   4.23  2.63 low    low   
#>  5 0.31  Good    J     SI2      63.3    58   335  4.34  4.35  2.75 low    low   
#>  6 0.24  Very G~ J     VVS2     62.8    57   336  3.94  3.96  2.48 low    low   
#>  7 0.24  Very G~ I     VVS1     62.3    57   336  3.95  3.98  2.47 low    low   
#>  8 0.26  Very G~ H     SI1      61.9    55   337  4.07  4.11  2.53 low    low   
#>  9 0.22  Fair    E     VS2      65.1    61   337  3.87  3.78  2.49 low    low   
#> 10 0.23  Very G~ H     VS1      59.4    61   338  4     4.05  2.39 low    low   
#> # ... with 53,930 more rows, and 1 more variable: z_char <chr>

Created on 2021-03-09 by the reprex package (v1.0.0)
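
If you would rather stay with mutate_at, a minimal sketch of an alternative is to pass a named list of functions: with several columns selected, dplyr appends the list name to each column name, so the originals are kept. The helper size_label below is a hypothetical name introduced only for this illustration, not something from the original post. Note that mutate_at is superseded in dplyr 1.0.0 and later, so across() remains the recommended form.

```r
library(dplyr)

# diamonds ships with ggplot2; only the dataset is needed here
diamonds <- ggplot2::diamonds

# Hypothetical helper (an assumption for this sketch): recodes a numeric
# vector into the three size labels used in the question.
size_label <- function(v) {
  case_when(v <= 4.5 ~ "low",
            v >  4.5 & v < 7 ~ "so-so",
            v >= 7 ~ "large",
            TRUE ~ NA_character_)
}

# Naming the list element "char" makes mutate_at create x_char, y_char and
# z_char next to the untouched x, y and z columns.
with_at <- diamonds %>%
  mutate_at(vars(x, y, z), list(char = size_label))

# The same recode written with across(), for comparison
with_across <- diamonds %>%
  mutate(across(c(x, y, z), size_label, .names = "{.col}_char"))

# Compare the new columns produced by the two approaches
identical(with_at$x_char, with_across$x_char)
```

Pulling the case_when into a named function is a design choice rather than a requirement; an inline formula such as list(char = ~ case_when(...)) should work in the same way.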

