I am trying to use group_by within a function call in dplyr (R) and I am getting unexpected results. Here is an example of what I am trying to do:
library(dplyr)
df = data.frame(a = c(0,0,1,1), b = c(0,1,0,1), c = c(1,2,3,4))
result1 = df %>%
  group_by(a,b) %>%
  mutate(d = sum(c))
result1$d
myFunc <- function(df, var) {
  output = df %>%
    group_by(a,!!var) %>%
    mutate(d = sum(c))
  return(output)
}
result2 = myFunc(df,"b")
result2$d
result1$d yields [1,2,3,4] which is what I expected. result2$d yields [3,3,7,7] which I do not want, and I am not sure what is going on.
It works to have b (without quotes) as the function argument, and {{var}} in place of !!var. Unfortunately, in my case, my column names are in string format (but maybe there is a way to transform the string beforehand so that it will work with the {{}} notation?)
If you want to pass a character object that refers to a column of a data frame, you should use !!sym(var):
myFunc <- function(df, var) {
  output = df %>%
    group_by(a, !!sym(var)) %>%
    mutate(d = sum(c))
  return(output)
}
myFunc(df, "b")
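A minimal sketch of why the string version misbehaves, assuming dplyr is installed (rlang comes with it as a dependency). Unquoting a string with !! inserts the constant "b" rather than the column b, so every row shares the same value in that grouping term and only a separates the groups. The names by_string and by_symbol are illustrative and not part of the original post.

```r
library(dplyr)

df <- data.frame(a = c(0, 0, 1, 1), b = c(0, 1, 0, 1), c = c(1, 2, 3, 4))
var <- "b"

# !!var inserts the literal string "b", which is constant across rows,
# so the data are effectively grouped by `a` alone.
by_string <- df %>%
  group_by(a, !!var) %>%
  mutate(d = sum(c))
n_groups(by_string)   # 2
by_string$d           # 3 3 7 7

# rlang::sym() converts the string into a symbol, which refers to column b.
by_symbol <- df %>%
  group_by(a, !!rlang::sym(var)) %>%
  mutate(d = sum(c))
n_groups(by_symbol)   # 4
by_symbol$d           # 1 2 3 4
```

Comparing n_groups() for the two calls is a quick way to check which columns actually ended up defining the groups.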
If you want to pass a data-masked argument, you should use {{ var }} or, equivalently, !!enquo(var):
myFunc <- function(df, var) {
  output = df %>%
    group_by(a, {{ var }}) %>%
    mutate(d = sum(c))
  return(output)
}
myFunc(df, b)
Note that I pass "b" and b, respectively, into the function in the two cases.
If we want to use quoting and unquoting instead of curly-curly {{ }}, then we should consider this basic procedure: https://tidyeval.tidyverse.org/dplyr.html
Creating a function around dplyr pipelines involves three steps: abstraction, quoting, and unquoting.
1. Abstraction step: identify the varying part of the pipeline, here var in group_by.
2. Quoting step: apply enquo() to these arguments.
3. Unquoting step: unquote with !!. Here, var is passed to group_by():
myFunc <- function(df, var) {
  var <- enquo(var)
  output = df %>%
    group_by(a,!!var) %>%
    mutate(d = sum(c))
  return(output)
}
result2 = myFunc(df,b)
result2$d
output:
[1] 1 2 3 4
Just as I post a question, I come across something that works...
myFunc <- function(df, var) {
  output = df %>%
    group_by_at(.vars = c("a",var)) %>%
    mutate(d = sum(c))
  return(output)
}
result2 = myFunc(df,"b")
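In dplyr 1.0.0 and later, group_by_at() is superseded in favour of across(). A rough equivalent of the function above, assuming dplyr 1.0.0 or later, might look like the sketch below. The name my_func_across is hypothetical and chosen only for this illustration.

```r
library(dplyr)

df <- data.frame(a = c(0, 0, 1, 1), b = c(0, 1, 0, 1), c = c(1, 2, 3, 4))

# Hypothetical variant of myFunc: all_of() selects columns from a character
# vector, and across() lets group_by() use that selection.
my_func_across <- function(df, var) {
  df %>%
    group_by(across(all_of(c("a", var)))) %>%
    mutate(d = sum(c))
}

result3 <- my_func_across(df, "b")
result3$d   # 1 2 3 4
```

Because all_of() takes strings, this form keeps the column name as a character argument, which matches the constraint in the question.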