Passing column names as strings to group_by and summarize

Question

Starting with dplyr 0.7, the verbs ending in an underscore, such as summarize_ and group_by_, are deprecated, since we are supposed to use quosures instead.

See: https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

I am trying to implement the following example using quo() and !!.

Working example:

library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

lFG <- df %>%
  group_by(x, y)
lFG %>% summarize(min(z))

However, in my case the columns to group by and to summarize are specified as strings.

cols2group <- c("x","y")
col2summarize <- "z"

How can I get the same example as above working?

Answer 1

For this you can now use the _at versions of the verbs:

df %>%  
  group_by_at(cols2group) %>% 
  summarize_at(.vars = col2summarize, .funs = min)

Edit (2021-06-09): Please see Ronak Shah's answer; using summarise(across(all_of(col2summarize), min)) is now the preferred option.

Answer 2

From dplyr 1.0.0 you can use across:

library(dplyr)

cols2group <- c("x","y")
col2summarize <- "z"

df %>%
  group_by(across(all_of(cols2group))) %>%
  summarise(across(all_of(col2summarize), min)) %>%
  ungroup()

#   x       y     z
#  <chr> <dbl> <int>
#1 a         1     1
#2 a         2     3
#3 b         2     4
#4 b         3     5
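
If the same pattern is needed in several places, it can be wrapped in a small function. The following is only a sketch: summarise_by is a hypothetical helper name, not part of dplyr, and it assumes dplyr 1.0.0 or later (for across and the .groups argument).

```r
library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

# Hypothetical helper (not a dplyr function): group `data` by the columns
# named in `group_cols`, then apply `fn` to the columns named in `value_cols`.
summarise_by <- function(data, group_cols, value_cols, fn = min) {
  data %>%
    group_by(across(all_of(group_cols))) %>%
    # .groups = "drop" returns an ungrouped result, so no ungroup() is needed
    summarise(across(all_of(value_cols), fn), .groups = "drop")
}

res <- summarise_by(df, c("x", "y"), "z")
```

all_of() raises an error if one of the strings does not match a column, which tends to be safer than silently ignoring a misspelled name.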

Answer 3

Another option is to use non-standard evaluation (NSE) and have R interpret the strings as quoted names of objects:

cols2group <- c("x","y")
col2summarize <- "z"

df %>%
  group_by(!!!rlang::syms(cols2group)) %>%
  summarize(min(!!rlang::sym(col2summarize)))

rlang::sym() turns a single string into a symbol, and rlang::syms() does the same for a character vector, returning a list of symbols. The !! operator unquotes one symbol and !!! splices a list of them, so the symbols are used as names in the context of df, where they refer to the relevant columns. Because cols2group holds two names, it needs syms() with !!!; sym() accepts only a single string. There are different ways of doing the same thing, as always, and this is the shorthand I tend to use!
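
As a variation on the same idea, the .data pronoun can look up a single column by its string name inside summarize, which avoids building a symbol by hand. This is a sketch rather than the answer's own method; the output column name min_z is chosen here for illustration, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)
cols2group <- c("x", "y")
col2summarize <- "z"

res <- df %>%
  # syms() converts each string to a symbol; !!! splices them in as arguments
  group_by(!!!rlang::syms(cols2group)) %>%
  # .data[[...]] fetches the column whose name is stored in the string
  summarize(min_z = min(.data[[col2summarize]]), .groups = "drop")
```

Naming the result (min_z) gives a predictable column name instead of the auto-generated one derived from the expression.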

Answer 4

See ?dplyr::across for the updated way to do this, since group_by_at and summarize_at are now superseded.
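
As a rough illustration of what across offers beyond the _at verbs, the sketch below applies two functions to the column named by a string and controls the output names with the .names argument. The choice of min and max and the "{.col}_{.fn}" naming pattern are assumptions made for this example.

```r
library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)
cols2group <- c("x", "y")
col2summarize <- "z"

res <- df %>%
  group_by(across(all_of(cols2group))) %>%
  summarise(
    # A named list of functions yields one output column per function;
    # .names builds each name from the input column and the list name.
    across(all_of(col2summarize), list(min = min, max = max),
           .names = "{.col}_{.fn}"),
    .groups = "drop"
  )
```

With the data above this produces the columns z_min and z_max next to the grouping columns x and y.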

4 answers

Answer 1: 12, accepted, 2017-10-24 19:51:42
Answer 2: 4, 2021-04-02 03:14:40
Answer 3: 2, 2020-11-18 19:12:47
Answer 4: 1, 2020-12-20 14:42:45