
R dplyr mutate multiple columns using custom function to create new column

I'd like to create a new column in a data.frame with dplyr::mutate, using a custom function whose argument is a vector built from the data.frame's columns, but I get the following output:

library(dplyr)

customFun <- function(col.vec) {
  paste0(gsub("\\s", "_", col.vec), collapse = "-")
}

df <- data.frame(A = c("x 1", "x", "x w"), B = c("E", "D", "2 w"), stringsAsFactors = FALSE)

df %>%
   mutate(C = customFun(c(A, B)))
    A   B                 C
1 x 1   E x_1-x-x_w-E-D-2_w
2   x   D x_1-x-x_w-E-D-2_w
3 x w 2 w x_1-x-x_w-E-D-2_w

instead of the row-by-row result that data.table gives me:

data.table::data.table(df)[, C := customFun(c(A, B)), by = .(A, B)]
     A   B       C
1: x 1   E   x_1-E
2:   x   D     x-D
3: x w 2 w x_w-2_w

This can be achieved in many ways, but I'm interested in a dplyr solution only. Thank you for your help.

We can use map and lift_dl. We first map over each element of col.vec (notice that I pass a list instead of a vector, since c flattens its arguments into one long vector, while list keeps them separate) and apply gsub to each element. The resulting list is then fed into paste. Since paste takes its inputs through ..., we can use purrr::lift_dl to lift its input domain from ... to a list:

library(dplyr)
library(purrr)

customFun <- function(col.vec) {
  map(col.vec, ~gsub("\\s", "_", .x)) %>%
    lift_dl(paste, sep = "-")()
}

df %>%
  mutate(C = customFun(list(A, B)))

or with ... as input:

customFun <- function(...) {
  col.vec <- list(...)
  map(col.vec, ~gsub("\\s", "_", .x)) %>%
    lift_dl(paste, sep = "-")()
}

df %>%
  mutate(C = customFun(A, B))

Output:

    A   B       C
1 x 1   E   x_1-E
2   x   D     x-D
3 x w 2 w x_w-2_w
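
As far as I know, the lift_* family was deprecated in purrr 1.0.0, so the code above may emit a deprecation warning on newer installations. Here is a minimal sketch of the same idea without it, using base R's do.call in place of lift_dl. The name customFun2 is made up for illustration and is not part of the original answer.

```r
library(dplyr)

df <- data.frame(A = c("x 1", "x", "x w"), B = c("E", "D", "2 w"),
                 stringsAsFactors = FALSE)

# c() flattens both columns into one vector of length 6;
# list() keeps them as two separate vectors of length 3 each.
length(c(df$A, df$B))     # 6
length(list(df$A, df$B))  # 2

# customFun2 is a hypothetical name used only in this sketch.
customFun2 <- function(...) {
  # Replace whitespace in each column separately
  cols <- lapply(list(...), function(x) gsub("\\s", "_", x))
  # paste() is vectorised, so it joins the columns element by element
  do.call(paste, c(cols, sep = "-"))
}

res <- df %>%
  mutate(C = customFun2(A, B))
res$C
```

Because paste is already vectorised over its arguments, no row-wise grouping is needed here: each element of C is built from the matching elements of A and B.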

Why use by = .(A, B) in your data.table solution? If two rows have exactly the same values, they end up in the same group and their values are collapsed into a single string. You need to modify your customFun, because it is not correct the way it is:

library(tidyverse)
customFun = function(data) invoke(paste, data.frame(gsub('\\s+', '_', as.matrix(data))), sep='-')

df %>%
  mutate(C = customFun(.))

    A   B       C
1 x 1   E   x_1-E
2   x   D     x-D
3 x w 2 w x_w-2_w

You can replace invoke with do.call, or even with lift, etc.
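
To make the point about duplicated rows concrete, here is a rough sketch. It assumes dplyr's group_by(A, B) as a stand-in for data.table's by = .(A, B), and it adds a duplicate of the second row to the question's data. rowFun is a hypothetical name for a do.call variant of the function above.

```r
library(dplyr)

# Same data as in the question, plus a duplicate of the second row
df2 <- data.frame(A = c("x 1", "x", "x w", "x"),
                  B = c("E", "D", "2 w", "D"),
                  stringsAsFactors = FALSE)

# The question's function: collapses everything it receives into one string
customFun <- function(col.vec) {
  paste0(gsub("\\s", "_", col.vec), collapse = "-")
}

# Grouping by A and B puts the two identical rows into one group,
# so the function sees c("x", "x", "D", "D") for that group.
grouped <- df2 %>%
  group_by(A, B) %>%
  mutate(C = customFun(c(A, B))) %>%
  ungroup()
grouped$C

# rowFun (hypothetical name) takes a data frame and pastes its
# columns together row by row, using do.call() instead of invoke().
rowFun <- function(data) {
  cleaned <- lapply(data, function(x) gsub("\\s+", "_", x))
  do.call(paste, c(unname(cleaned), sep = "-"))
}

by_row <- df2 %>%
  mutate(C = rowFun(.))
by_row$C
```

With the grouped version, the duplicated rows both get "x-x-D-D", while the row-by-row version gives "x-D" for each of them.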

Your function is not doing exactly what you want. Read the comment above

Just add rowwise() before your mutate() so that only each row's A and B values are passed to paste0, rather than the full vectors of all rows.

library(dplyr)

df %>%
  rowwise() %>%
  mutate(C = customFun(c(A, B)))
#> Source: local data frame [3 x 3]
#> Groups: <by row>
#> 
#> # A tibble: 3 x 3
#>   A     B     C      
#>   <chr> <chr> <chr>  
#> 1 x 1   E     x_1-E  
#> 2 x     D     x-D    
#> 3 x w   2 w   x_w-2_w

Created on 2019-02-05 by the reprex package (v0.2.1)
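
One thing to keep in mind is that the result stays grouped by row, so later verbs will also run one row at a time. Here is a small sketch of the same pipeline with an ungroup() at the end, assuming you want a regular tibble back afterwards:

```r
library(dplyr)

df <- data.frame(A = c("x 1", "x", "x w"), B = c("E", "D", "2 w"),
                 stringsAsFactors = FALSE)

# The function from the question, unchanged
customFun <- function(col.vec) {
  paste0(gsub("\\s", "_", col.vec), collapse = "-")
}

out <- df %>%
  rowwise() %>%                       # evaluate mutate() one row at a time
  mutate(C = customFun(c(A, B))) %>%  # c(A, B) now holds a single row's values
  ungroup()                           # drop the row-wise grouping afterwards

out$C
```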
