How to pass a variable name to dplyr's group_by()

I can calculate the rank of the values ( val ) in my dataframe df within the group name1 with the code:

res  <- df %>% arrange(val) %>% group_by(name1) %>% mutate(RANK=row_number()) 

Instead of hard-coding the column name1 in the code, I want to pass it as a variable, e.g. crit1 = "name1" . However, the code below does not work, because group_by() looks for a column literally named crit1 instead of using the value stored in the variable.

res  <- df %>% arrange(val) %>% group_by(crit1) %>% mutate(RANK=row_number()) 

How can I pass crit1 in the code?

Thanks.

We can use group_by_

library(dplyr)
df %>%
    arrange(val) %>% 
    group_by_(.dots=crit1) %>%
    mutate(RANK=row_number()) 
#Source: local data frame [10 x 4]
#Groups: name1, name2 [7]

#            val name1 name2  RANK
#          <dbl> <chr> <chr> <int>
#1  -0.848370044     b     c     1
#2  -0.583627199     a     a     1
#3  -0.545880758     a     a     2
#4  -0.466495124     b     b     1
#5   0.002311942     a     c     1
#6   0.266021979     c     a     1
#7   0.419623149     c     b     1
#8   0.444585270     a     c     2
#9   0.536585304     b     a     1
#10  0.847460017     a     c     3

Update

group_by_ is deprecated in recent versions of dplyr (0.8.1 at the time of this update), so we can use group_by_at, which takes a character vector of column names as input:

df %>%
  arrange(val) %>% 
  group_by_at(crit1) %>%
  mutate(RANK=row_number())

Another option is to convert the strings to symbols ( syms from rlang ) and unquote-splice them ( !!! ) into group_by:

df %>%
   arrange(val) %>% 
   group_by(!!! rlang::syms(crit1)) %>% 
   mutate(RANK = row_number())

data

set.seed(24)
df <- data.frame(val = rnorm(10), name1= sample(letters[1:3], 10, replace=TRUE), 
         name2 = sample(letters[1:3], 10, replace=TRUE), 
 stringsAsFactors=FALSE)

crit1 <- c("name1", "name2")

Update with dplyr 1.0.0

The new across syntax eliminates the need for !!! rlang::syms() . So you can now simplify the code to:

df %>%
   arrange(val) %>% 
   group_by(across(all_of(crit1))) %>% 
   mutate(RANK = row_number())
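If the same ranking is needed for several sets of grouping columns, the across(all_of()) pattern can be wrapped in a small helper. The following is only a sketch: rank_within and its argument names are made up for illustration (they are not part of dplyr), and it assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): rank rows within the groups
# defined by the column names in group_cols, ordered by value_col.
rank_within <- function(data, group_cols, value_col = "val") {
  data %>%
    arrange(.data[[value_col]]) %>%            # sort by the column named in value_col
    group_by(across(all_of(group_cols))) %>%   # group by a character vector of names
    mutate(RANK = row_number()) %>%
    ungroup()
}

# Same example data as in the answer above
set.seed(24)
df <- data.frame(val = rnorm(10),
                 name1 = sample(letters[1:3], 10, replace = TRUE),
                 name2 = sample(letters[1:3], 10, replace = TRUE),
                 stringsAsFactors = FALSE)

res1 <- rank_within(df, "name1")               # one grouping column
res2 <- rank_within(df, c("name1", "name2"))   # two grouping columns
```

Because all_of() is strict, a misspelled column name in group_cols raises an error instead of being silently ignored.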

Facing a similar task, I got both of the following options to work.

Use across() :

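library(dplyr)

for (crit in names(df)) {
  print(df |> 
          # all_of() marks crit as an external character vector; a bare
          # crit still works but is deprecated in recent tidyselect versions
          group_by(across(all_of(crit))) |> 
          count())
}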

Use syms() and !! :

# convert every column name to a symbol
crits = rlang::syms(names(df))

for (crit in crits) {
  print(df |> 
          # each crit is a single symbol, so !! is enough here;
          # !!! is only needed to splice a whole list of symbols
          group_by(!!crit) |> 
          count())
}
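For a single column name held in a string, the .data pronoun is another possibility besides across() and symbols. A minimal sketch, assuming the example df from the data section above (rebuilt here so the snippet runs on its own); the variable name crit is just for illustration:

```r
library(dplyr)

# Same example data as in the accepted answer
set.seed(24)
df <- data.frame(val = rnorm(10),
                 name1 = sample(letters[1:3], 10, replace = TRUE),
                 name2 = sample(letters[1:3], 10, replace = TRUE),
                 stringsAsFactors = FALSE)

crit <- "name1"   # column name stored as a string

res <- df %>%
  arrange(val) %>%
  group_by(.data[[crit]]) %>%   # look up the column by its name
  mutate(RANK = row_number())
```

.data[[ ]] accepts exactly one string, so for a vector of several column names across(all_of()) or !!! syms() is still the way to go.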
