
Pass a varying list of functions to dplyr summarize

Is it possible to pass a list of functions to dplyr::summarize in a way that allows the list of functions to vary? I'd like to create an overall function that builds a summary table but allows different groups of functions in the output [edit: when the functions are not all being applied to the same column].

I was thinking this could be done by creating an overall function whose T/F arguments select which groups of summary functions to include (funA and funB are lists of functions, and setting funA=T/F, funB=T/F lets the user include all functions from funA, funB, or both), but I am not sure how to write the initial function lists (funA, funB) when the functions are not all being applied to the same column. Below is an idea of how it would be structured. Is this possible, or is there a better way to do this?

#Essentially - how would I write a function to selectively include a group of functions (for example either funA = c(n, min, max) or funB=c(n_na, n_neg), or both).  

extract_all <- function(x){

   x %>% summarize(n=n(), 
                   min = min(disp, na.rm=TRUE), 
                   max = max(disp, na.rm=TRUE),
                   n_na = sum(is.na(wt)),  
                   n_neg = sum(vs < 0, na.rm=TRUE))

}
test <- mtcars %>% group_by(cyl) %>% extract_all()

#Does this structure work?
extract_summaries <- function(x, funA=TRUE, funB=FALSE){
  funAls <- list()  #but how do you write n, min, max in here?
  funBls <- list()  #and n_na, n_neg in here

 funls <- append(funAls[funA], funBls[funB])

 summarize(x, funls)
}

#which could be run with:
test <- mtcars %>% group_by(cyl) %>% extract_summaries(funA=TRUE, funB=TRUE)


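Here is one option: store each group as a named list of functions, subset each list with its logical flag, and apply the combined list to a single column with summarise_at.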

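library(dplyr)

extract_summaries <- function(x, colnm, funA = TRUE, funB = FALSE) {
  # Group A: basic descriptives; group B: data-quality counts
  funAls <- list(n = length, min = min, max = max)
  funBls <- list(n_na = function(y) sum(is.na(y)),
                 n_neg = function(y) sum(y < 0, na.rm = TRUE))
  # list[TRUE] keeps every element, list[FALSE] keeps none
  funls <- append(funAls[funA], funBls[funB])

  # Apply every selected function to the column passed as colnm
  x %>%
    summarise_at(vars({{colnm}}), funls)
}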


test <- mtcars %>% 
           group_by(cyl) %>%
           extract_summaries(mpg, funA=TRUE, funB=TRUE)



test
# A tibble: 3 x 6
#    cyl     n   min   max  n_na n_neg
#  <dbl> <int> <dbl> <dbl> <int> <int>
#1     4    11  21.4  33.9     0     0
#2     6     7  17.8  21.4     0     0
#3     8    14  10.4  19.2     0     0
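
As a side note, summarise_at() has been superseded by across() since dplyr 1.0.0. The following is a minimal sketch of the same idea written with across(), assuming dplyr >= 1.0.0 is installed; the function name extract_summaries_across is hypothetical and chosen only so it does not clash with the function above. Setting .names = "{.fn}" is meant to keep the output columns named after the list entries (n, min, max, ...).

```r
library(dplyr)

# Hypothetical name: an across()-based variant of extract_summaries
extract_summaries_across <- function(x, colnm, funA = TRUE, funB = FALSE) {
  funAls <- list(n = length, min = min, max = max)
  funBls <- list(n_na = function(y) sum(is.na(y)),
                 n_neg = function(y) sum(y < 0, na.rm = TRUE))
  # Logical subsetting keeps a whole group or drops it
  funls <- c(funAls[funA], funBls[funB])

  # Name each output column after the function's name in the list
  x %>%
    summarise(across({{ colnm }}, funls, .names = "{.fn}"))
}

res_across <- mtcars %>%
  group_by(cyl) %>%
  extract_summaries_across(mpg, funA = TRUE, funB = TRUE)
```

With a single column this should give the same column layout as the summarise_at version; with several columns the "{.fn}" naming would collide, so the default "{.col}_{.fn}" is probably the safer choice there.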

test <- mtcars %>% 
    group_by(cyl) %>% 
    extract_summaries(mpg, funA = FALSE, funB = TRUE)
test
# A tibble: 3 x 3
#    cyl  n_na n_neg
#  <dbl> <int> <int>
#1     4     0     0
#2     6     0     0
#3     8     0     0
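
The function above applies every summary to one column. For the case raised in the question's edit, where the summaries touch different columns (disp, wt, vs), one possible approach is to store unevaluated expressions instead of functions and splice them into summarise() with `!!!`. This is a sketch under that assumption, not the answerer's method: rlang::exprs() and `!!!` are existing rlang/dplyr features, while the function name extract_summaries_expr is hypothetical and the column names are hard-coded from the question's extract_all().

```r
library(dplyr)

# Hypothetical name: groups of summaries stored as unevaluated expressions
extract_summaries_expr <- function(x, funA = TRUE, funB = FALSE) {
  # Each entry is a full summary expression, so columns can differ
  funAls <- rlang::exprs(
    n   = n(),
    min = min(disp, na.rm = TRUE),
    max = max(disp, na.rm = TRUE)
  )
  funBls <- rlang::exprs(
    n_na  = sum(is.na(wt)),
    n_neg = sum(vs < 0, na.rm = TRUE)
  )
  # Logical subsetting keeps a whole group or drops it
  funls <- c(funAls[funA], funBls[funB])

  # Splice the named expressions in as summarise() arguments
  summarise(x, !!!funls)
}

res_expr <- mtcars %>%
  group_by(cyl) %>%
  extract_summaries_expr(funA = TRUE, funB = TRUE)
```

Because the expressions are only evaluated inside summarise(), the column names are resolved against the grouped data, which is why n() and different columns can be mixed in one list.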
