
R user-defined/dynamic summary function within dplyr::summarise

Somewhat hard to define this question without sounding like lots of similar questions!

I have a function in which I want one of the parameters to be a function name that will be passed to dplyr::summarise, e.g. "mean" or "sum":

library(dplyr)
data(mtcars)

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  zColquo <- quo_name(zCol)

  cellSummaries <- x %>%
    group_by(gear, !!sym(groupcol)) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              !!zColquo := mean(!!sym(zColquo))) %>% # mean should be zFun, user-defined
    ungroup()

  cellSummaries
}

(This groups by gear and cyl, then returns, per group, the row count and mean(disp).)

Per my note, I'd like 'mean' to be dynamic, performing the function defined by zFun, but I can't for the life of me work out how to do it. Thanks in advance for any advice.

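You can use match.fun to make the function dynamic: it looks up the function named by the string in zFun and returns it, so it can be called directly. I also removed zColquo, as it's not needed when zCol is already a string.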

library(dplyr)
library(rlang)

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  cellSummaries <- x %>%
    group_by(gear, !!sym(groupcol)) %>%
    summarise(Count = n(),
              !!zCol := match.fun(zFun)(!!sym(zCol))) %>%
    ungroup()

  return(cellSummaries)
}

You can then check the output:

f()

# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  242.
#3     3     8    12  358.
#4     4     4     8  103.
#5     4     6     4  164.
#6     5     4     2  108.
#7     5     6     1  145 
#8     5     8     2  326 

f(zFun = "sum")

# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  483 
#3     3     8    12 4291.
#4     4     4     8  821 
#5     4     6     4  655.
#6     5     4     2  215.
#7     5     6     1  145 
#8     5     8     2  652 
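
Because match.fun returns its argument unchanged when it is already a function, the same approach should also accept a function object or an anonymous function, not only a name. Below is a minimal sketch under that assumption; f2 and fun are hypothetical names used only for illustration, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical variant of f(): zFun may be a name ("median") or a function (median).
f2 <- function(x = mtcars,
               groupcol = "cyl",
               zCol = "disp",
               zFun = "mean") {

  fun <- match.fun(zFun) # resolve once, before the pipeline

  x %>%
    group_by(gear, !!sym(groupcol)) %>%
    summarise(Count = n(),
              !!zCol := fun(!!sym(zCol)),
              .groups = "drop") # assumes dplyr >= 1.0.0
}

by_name     <- f2(zFun = "median")                        # function given by name
by_function <- f2(zFun = median)                          # function object
by_lambda   <- f2(zFun = function(v) max(v) - min(v))     # anonymous function
```

Resolving the function once, outside summarise, also means a misspelled name fails immediately with a lookup error rather than in the middle of the summary.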

We can also use get to look up the function by its name:

library(dplyr)

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  zColquo <- quo_name(zCol)
  x %>%
    group_by(gear, !!sym(groupcol)) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              !!zColquo := get(zFun)(!!sym(zCol))) %>%
    ungroup()
}

f()
# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  242.
#3     3     8    12  358.
#4     4     4     8  103.
#5     4     6     4  164.
#6     5     4     2  108.
#7     5     6     1  145 
#8     5     8     2  326 


f(zFun = "sum")
# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  483 
#3     3     8    12 4291.
#4     4     4     8  821 
#5     4     6     4  655.
#6     5     4     2  215.
#7     5     6     1  145 
#8     5     8     2  652 
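
One caveat worth keeping in mind: plain get returns whatever object carries that name, which need not be a function. Inside summarise this could matter if, say, the data had a column with the same name as the requested function. The sketch below shows the difference at top level using base R only; found_plain and found_fun are hypothetical names for illustration.

```r
# A non-function object that happens to share its name with a function.
sum <- c(1, 2, 3)

found_plain <- get("sum")                     # the numeric vector defined above
found_fun   <- get("sum", mode = "function")  # skips the vector, finds base::sum

is.function(found_plain) # FALSE
is.function(found_fun)   # TRUE
found_fun(1:4)           # 10
```

Writing get(zFun, mode = "function") in the answer's code would therefore be a slightly more defensive choice; match.fun performs the same function-only lookup for character input.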

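In addition, we could drop the !!sym() unquoting in group_by and in summarise by wrapping the column names in across. Wrapping the character vectors in all_of() makes it explicit that they hold column names, and avoids the message recent tidyselect versions emit about using an external vector in a selection:

f <- function(x = mtcars,
              groupcol = "cyl",
              zCol = "disp",
              zFun = "mean") {

  x %>%
    group_by(across(c(gear, all_of(groupcol)))) %>% # 1 preset grouper, 1 user-defined
    summarise(Count = n(), # 1 preset summary, 1 user-defined
              across(all_of(zCol), ~ get(zFun)(.))) %>%
    ungroup()
}

f()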
# A tibble: 8 x 4
#   gear   cyl Count  disp
#  <dbl> <dbl> <int> <dbl>
#1     3     4     1  120.
#2     3     6     2  242.
#3     3     8    12  358.
#4     4     4     8  103.
#5     4     6     4  164.
#6     5     4     2  108.
#7     5     6     1  145 
#8     5     8     2  326 
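
A side effect of the across form is that zCol no longer has to be a single column. The following is a rough sketch of that idea, assuming dplyr 1.0.0 or later; f_across and fun are hypothetical names, and the function is resolved with match.fun before the pipeline instead of calling get inside a lambda.

```r
library(dplyr)

# Hypothetical variant: zCol may name one or several columns.
f_across <- function(x = mtcars,
                     groupcol = "cyl",
                     zCol = "disp",
                     zFun = "mean") {

  fun <- match.fun(zFun) # look the function up once

  x %>%
    group_by(across(all_of(c("gear", groupcol)))) %>% # both groupers given as strings
    summarise(Count = n(),
              across(all_of(zCol), fun), # apply fun to every column in zCol
              .groups = "drop")
}

res <- f_across(zCol = c("disp", "hp"), zFun = "sum")
names(res) # "gear" "cyl" "Count" "disp" "hp"
```

Each column named in zCol keeps its own name in the result, so the single-column call f_across() should give the same shape of output as f().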
