Pass column names as strings to group_by and summarize

Starting with dplyr 0.7, the verbs ending in an underscore, such as summarize_ and group_by_, are deprecated because we are supposed to use quosures instead.

See: https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

I am trying to implement the following example using quo and !!.

Working example:

library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

lFG <- df %>% 
   group_by( x,y) 
lFG %>% summarize( min(z))

However, in my case the columns to group by and the column to summarize are specified as strings.

cols2group <- c("x","y")
col2summarize <- "z"

How can I get the same example as above working?

For this you can now use the _at versions of the verbs:

df %>%  
  group_by_at(cols2group) %>% 
  summarize_at(.vars = col2summarize, .funs = min)
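As a quick sanity check, the string-driven pipeline can be compared against the hard-coded one from the question. This is only a sketch: the names expected and via_at are made up for illustration, the output column is named z explicitly so both results share the same column names, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)
cols2group <- c("x", "y")
col2summarize <- "z"

# Hard-coded column names; the result column is named z on purpose
expected <- df %>%
  group_by(x, y) %>%
  summarize(z = min(z), .groups = "drop")

# Same computation driven by strings through the _at verbs
via_at <- df %>%
  group_by_at(cols2group) %>%
  summarize_at(.vars = col2summarize, .funs = min) %>%
  ungroup()

# Compare as plain data frames to ignore tibble-specific attributes
same <- isTRUE(all.equal(as.data.frame(expected), as.data.frame(via_at)))
print(same)
```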

Edit (2021-06-09): Please see Ronak Shah's answer; using summarise(across(all_of(col2summarize), min)) is now the preferred option.

From dplyr 1.0.0 you can use across:

library(dplyr)

cols2group <- c("x","y")
col2summarize <- "z"

df %>%
  group_by(across(all_of(cols2group))) %>%
  summarise(across(all_of(col2summarize), min)) %>%
  ungroup()

#   x       y     z
#  <chr> <dbl> <int>
#1 a         1     1
#2 a         2     3
#3 b         2     4
#4 b         3     5
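If this pattern is needed in several places, it could be wrapped in a small function. The sketch below is one possible way to do that; summarise_by_names and its arguments are hypothetical names chosen for illustration, not part of dplyr, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical helper: group and summarise by columns given as strings.
# group_cols and value_cols are character vectors; fn is applied to each value column.
summarise_by_names <- function(data, group_cols, value_cols, fn = min) {
  data %>%
    group_by(across(all_of(group_cols))) %>%
    summarise(across(all_of(value_cols), fn), .groups = "drop")
}

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

result <- summarise_by_names(df, c("x", "y"), "z")
print(result)
```

Because all_of() is used, a misspelled column name raises an error instead of being silently ignored.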

Another option is to use non-standard evaluation (NSE), and have R interpret the strings as quoted names of objects:

cols2group <- c("x","y")
col2summarize <- "z"

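df %>%
  # syms() handles a character vector; !!! splices the resulting symbols in
  group_by(!!!rlang::syms(cols2group)) %>%
  # sym() handles a single string; !! unquotes it
  summarize(min(!!rlang::sym(col2summarize)))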

The rlang::sym() function takes a single string and turns it into a symbol (rlang::syms() does the same for a character vector and returns a list of symbols). The symbol is then unquoted by !! (a list of symbols is spliced in with !!!) and used as a name in the context of df, where it refers to the relevant column. There are different ways of doing the same thing, as always, and this is the shorthand I tend to use!
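To make the difference between a single symbol and a list of symbols concrete, here is a minimal sketch. It also uses the .data pronoun as an alternative way to refer to one column by string. The variable names group_syms and res and the output column name min_z are chosen for illustration only, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(rlang)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)
cols2group <- c("x", "y")
col2summarize <- "z"

# syms() turns a character vector into a list of symbols, one per string
group_syms <- syms(cols2group)

res <- df %>%
  # !!! splices the list so group_by() sees group_by(x, y)
  group_by(!!!group_syms) %>%
  # .data[[...]] looks up a single column by its name given as a string
  summarize(min_z = min(.data[[col2summarize]]), .groups = "drop")

print(res)
```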

See ?dplyr::across for the updated way to do this, since group_by_at and summarise_at are now superseded.
