Using the dplyr filter_at() function to select rows with conditions

I want to filter a data frame according to specific conditions in several columns.

I use the following example to make my question clearer.

I have a data frame:

library(dplyr)

dat <- data.frame(A = c(122, 122, 122), B = c(0.1, 0.1, 0.1),
                  C = c(5, 5, 4), D = c(6, 7, 6))

I want to select the rows that contain the maximum value in both column C and column D. My R code is:

select <- dat %>%
          group_by(A, B) %>%
          filter(C == max(C) , D == max(D))

and I get what I want:

> select
# A tibble: 1 x 4
# Groups:   A, B [1]
     A     B     C     D
   <dbl> <dbl> <dbl> <dbl>
1   122   0.1     5     7 

However, I want to use the filter_at() function instead:

select <- dat %>%
          group_by(A, B) %>%
          filter_at(vars(C, D), all_vars(. max))

It did not work. Thanks a lot for your help.

You can do this:

dat %>%
    group_by(A, B) %>%
    filter_at(vars(C, D), all_vars(. == max(.)))

The problem before was that all_vars() expects an expression that evaluates to a logical vector. Without a comparison operator such as ==, > or <, the expression . max is not valid R, so it was throwing an error back at you.
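To make the "evaluates to a logical" point concrete, here is a minimal base R sketch of what the predicate does for this particular data. It assumes a single group (true here, since A and B are constant), and the names c_is_max, d_is_max and keep are only illustrative:

```r
dat <- data.frame(A = c(122, 122, 122), B = c(0.1, 0.1, 0.1),
                  C = c(5, 5, 4), D = c(6, 7, 6))

# Each column is compared against its own maximum,
# which gives one logical vector per column
c_is_max <- dat$C == max(dat$C)   # TRUE  TRUE FALSE
d_is_max <- dat$D == max(dat$D)   # FALSE TRUE FALSE

# all_vars() keeps a row only when the test is TRUE for every column,
# which corresponds to combining the vectors with &
keep <- c_is_max & d_is_max       # FALSE TRUE FALSE

dat[keep, ]
```

Only the second row passes both tests, which matches the single row returned by the filter_at() call.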

As of dplyr 1.0, there is a new way to select, filter and mutate over several columns: the across() function, used together with the usual selection helpers. For this particular case, the filtering could also be accomplished as follows:

dat %>%
    group_by(A, B) %>%
    filter(across(c(C, D), ~ . == max(.)))

# A tibble: 1 x 4
# Groups:   A, B [1]
      A     B     C     D
  <dbl> <dbl> <dbl> <dbl>
1   122   0.1     5     7
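Later dplyr releases added if_all() and if_any() for use inside filter(), and the documentation now recommends them over across() in that position. The following is a sketch of the same filter written with if_all(), assuming dplyr 1.0.4 or later is installed; the name result is only illustrative:

```r
library(dplyr)

dat <- data.frame(A = c(122, 122, 122), B = c(0.1, 0.1, 0.1),
                  C = c(5, 5, 4), D = c(6, 7, 6))

# Keep rows where, within each (A, B) group, both C and D
# equal their group maximum
result <- dat %>%
    group_by(A, B) %>%
    filter(if_all(c(C, D), ~ .x == max(.x)))

result
```

if_all() plays the role of all_vars() here; if_any() would be the counterpart of any_vars(), keeping rows where at least one of the columns is at its maximum.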
