How can I use dplyr across() programmatically on no variables?

Issue:

I want to use across() programmatically so that the function won't fail if, e.g., NULL or an empty string is passed to it. This is possible using scoped variants of functions such as group_by_at(), but I'd like to make it work neatly (i.e. without if-statements) using across().

Note also that currently across() will affect all columns if left empty. I'm unsure what the motivation for this is; to me it would make more sense if no columns were affected.
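As a small illustration of the point above: the default for the .cols argument of across() is everything(), which is why an across() call with no column selection touches every column. The sketch below spells that default out and contrasts it with an explicitly empty selection. The toy data frame and the doubling function are made up for the example, and it assumes a reasonably recent dplyr.

```r
library(dplyr)

# Hypothetical toy data, not from the question
toy <- tibble(a = 1:2, b = 3:4)

# everything() is the default for .cols, so this is what an "empty"
# across() selection amounts to: every column is doubled.
doubled_all <- toy %>% mutate(across(everything(), ~ .x * 2))

# An explicitly empty character vector selects no columns,
# so the data should come back unchanged.
doubled_none <- toy %>% mutate(across(all_of(character(0)), ~ .x * 2))
```

The second call is the behavior the question is asking for when no variables are supplied.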

Example

Here's a quick example using functions to calculate the mean of a variable y. Passing NULL as the grouping works with group_by_at(), but not with across(), as shown:

library(dplyr)

my_df <- tibble(x = c("a", "a", "b", "b"), y = 1:4)

compute_mean1 <- function(df, grouping) { # compute grouped mean with across()
  df %>% 
    group_by(across(all_of(grouping))) %>% 
    summarise(y = mean(y), .groups = "drop")
}

compute_mean2 <- function(df, grouping) { # compute grouped mean with group_by_at()
  df %>% 
    group_by_at(grouping) %>% 
    summarise(y = mean(y), .groups = "drop")
}


compute_mean1(my_df, "x")
#> # A tibble: 2 x 2
#>   x         y
#>   <chr> <dbl>
#> 1 a       1.5
#> 2 b       3.5
compute_mean1(my_df, NULL)
#> Error: `vars` must be a character vector.
compute_mean2(my_df, "x")
#> # A tibble: 2 x 2
#>   x         y
#>   <chr> <dbl>
#> 1 a       1.5
#> 2 b       3.5
compute_mean2(my_df, NULL)
#> # A tibble: 1 x 1
#>       y
#>   <dbl>
#> 1   2.5

Created on 2020-07-14 by the reprex package (v0.3.0)

Pass .add = TRUE to group_by(), like this:

compute_mean3 <- function(df, grouping) { # compute grouped mean with across() and .add = TRUE
  df %>% 
    group_by(across(all_of(grouping)), .add = TRUE) %>%
    summarise(y = mean(y), .groups = "drop")
}
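To check the suggestion against the cases from the question, here is a self-contained sketch that redefines my_df and compute_mean3 exactly as above and calls the function with both a grouping column and NULL. It assumes dplyr 1.0.0 or later is installed. The result names by_x and overall are only for illustration.

```r
library(dplyr)

my_df <- tibble(x = c("a", "a", "b", "b"), y = 1:4)

compute_mean3 <- function(df, grouping) {
  df %>%
    group_by(across(all_of(grouping)), .add = TRUE) %>%
    summarise(y = mean(y), .groups = "drop")
}

# Grouped by x: one row per group
by_x <- compute_mean3(my_df, "x")

# No grouping variables: intended to behave like compute_mean2(my_df, NULL)
overall <- compute_mean3(my_df, NULL)

print(by_x)
print(overall)
```

If the approach behaves as described, the second call returns a single row holding the overall mean of y, matching what group_by_at() gives for NULL.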
 
