I'd like to create a new column in a data.frame using dplyr::mutate
with a custom function whose argument is a vector built from the data.frame's columns, but I get the following output:
library(dplyr)

customFun <- function(col.vec) {
  paste0(gsub("\\s", "_", col.vec), collapse = "-")
}
df <- data.frame(A = c("x 1", "x", "x w"), B = c("E", "D", "2 w"), stringsAsFactors = FALSE)
df %>%
  mutate(C = customFun(c(A, B)))
    A   B                 C
1 x 1   E x_1-x-x_w-E-D-2_w
2   x   D x_1-x-x_w-E-D-2_w
3 x w 2 w x_1-x-x_w-E-D-2_w
instead of the per-row result that I get with data.table:
data.table::data.table(df)[, C := customFun(c(A, B)), by = .(A, B)]
     A   B       C
1: x 1   E   x_1-E
2:   x   D     x-D
3: x w 2 w x_w-2_w
It can be achieved in many ways, but I'm interested in a dplyr solution only. Thank you for your help.
We can use map and lift_dl. We first map over each element of col.vec (notice I have used a list instead of a vector as input, since c flattens any vector elements, while list doesn't) and apply gsub. Then the list output is fed into paste. Since paste takes ..., we can use purrr::lift_dl to lift its input domain from ... to a list:
library(dplyr)
library(purrr)
customFun <- function(col.vec) {
  map(col.vec, ~gsub("\\s", "_", .x)) %>%
    lift_dl(paste, sep = "-")()
}
df %>%
  mutate(C = customFun(list(A, B)))
or with ... as input:
customFun <- function(...) {
  col.vec <- list(...)
  map(col.vec, ~gsub("\\s", "_", .x)) %>%
    lift_dl(paste, sep = "-")()
}
df %>%
  mutate(C = customFun(A, B))
Output:
    A   B       C
1 x 1   E   x_1-E
2   x   D     x-D
3 x w 2 w x_w-2_w
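A hedged aside: to my knowledge the lift_*() helpers are deprecated as of purrr 1.0.0, so on a recent installation the same idea can be sketched with base do.call, which likewise hands a list to paste as its ... arguments. The name customFun2 is made up for this illustration and is not part of the original answer.

```r
library(dplyr)

df <- data.frame(A = c("x 1", "x", "x w"), B = c("E", "D", "2 w"),
                 stringsAsFactors = FALSE)

# customFun2 is a hypothetical name; it takes the columns as separate arguments
customFun2 <- function(...) {
  # replace whitespace in every column, keeping the columns apart in a list
  cols <- lapply(list(...), function(x) gsub("\\s", "_", x))
  # do.call spreads the list over paste's ..., so values are pasted row by row
  do.call(paste, c(cols, sep = "-"))
}

res <- df %>%
  mutate(C = customFun2(A, B))
res$C
#> [1] "x_1-E"   "x-D"     "x_w-2_w"
```

Because paste is vectorized over its arguments, no grouping or row-wise step is needed here.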
Why use by=.(..) in your data.table solution? If two rows have exactly the same values, they fall into the same group and are pasted together rather than handled one by one. You need to modify your customFun; it is not correct the way it is:
library(tidyverse)
customFun = function(data) invoke(paste, data.frame(gsub('\\s+', '_', as.matrix(data))), sep='-')
df %>%
  mutate(C = customFun(.))
    A   B       C
1 x 1   E   x_1-E
2   x   D     x-D
3 x w 2 w x_w-2_w
You can replace invoke with do.call, or even lift, etc.
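As a rough sketch of the do.call variant, and of why grouping by the pasted columns goes wrong when rows repeat: the example below uses a copy of the question's data with the second row duplicated. The names df2 and rowPaste are invented for this illustration, and the group_by call is only a dplyr analogue of the data.table by = grouping, not the original code.

```r
library(dplyr)

# Question's data plus a duplicate of the second row (df2 is a made-up name)
df2 <- data.frame(A = c("x 1", "x", "x", "x w"), B = c("E", "D", "D", "2 w"),
                  stringsAsFactors = FALSE)

# Original function from the question: collapses everything it is given
customFun <- function(col.vec) {
  paste0(gsub("\\s", "_", col.vec), collapse = "-")
}

# Grouping by A and B puts the two identical rows into one group,
# so their values are collapsed together
grouped <- df2 %>%
  group_by(A, B) %>%
  mutate(C = customFun(c(A, B))) %>%
  ungroup()
grouped$C
#> [1] "x_1-E"   "x-x-D-D" "x-x-D-D" "x_w-2_w"

# rowPaste is a hypothetical name for the do.call variant of the answer's function
rowPaste <- function(data) {
  cleaned <- data.frame(gsub("\\s+", "_", as.matrix(data)), stringsAsFactors = FALSE)
  do.call(paste, c(cleaned, sep = "-"))
}

res <- df2 %>%
  mutate(C = rowPaste(.))
res$C
#> [1] "x_1-E"   "x-D"     "x-D"     "x_w-2_w"
```

Note that rowPaste(.) relies on the magrittr pipe's dot placeholder and pastes every column of the data frame.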
Your function is not doing exactly what you want. Read the comment above.
Just add rowwise before your mutate, so that only each row's A and B values are used in paste, rather than the vectors of all rows.
library(dplyr)
df %>%
  rowwise() %>%
  mutate(C = customFun(c(A, B)))
#> Source: local data frame [3 x 3]
#> Groups: <by row>
#>
#> # A tibble: 3 x 3
#>   A     B     C
#>   <chr> <chr> <chr>
#> 1 x 1   E     x_1-E
#> 2 x     D     x-D
#> 3 x w   2 w   x_w-2_w
Created on 2019-02-05 by the reprex package (v0.2.1)
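One thing worth keeping in mind with this approach: the result stays grouped by row, so later verbs such as summarise would also operate row by row. A minimal sketch, assuming the df and customFun from the question, that drops the row-wise grouping once the column is built:

```r
library(dplyr)

df <- data.frame(A = c("x 1", "x", "x w"), B = c("E", "D", "2 w"),
                 stringsAsFactors = FALSE)

customFun <- function(col.vec) {
  paste0(gsub("\\s", "_", col.vec), collapse = "-")
}

res <- df %>%
  rowwise() %>%                       # evaluate mutate once per row
  mutate(C = customFun(c(A, B))) %>%
  ungroup()                           # back to an ordinary, ungrouped tibble

res$C
#> [1] "x_1-E"   "x-D"     "x_w-2_w"
```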