I want to filter a data frame according to specific conditions in several columns.
I use the following example to make my question clearer.
I have a data frame:
dat <- data.frame(A = c(122, 122, 122), B = c(0.1, 0.1, 0.1),
C = c(5, 5, 4), D = c(6, 7, 6))
I want to select the rows that contain the maximum value in both column C and column D. My R code is:
library(dplyr)
select <- dat %>%
  group_by(A, B) %>%
  filter(C == max(C), D == max(D))
and I get what I want:
> select
# A tibble: 1 x 4
# Groups: A, B [1]
A B C D
<dbl> <dbl> <dbl> <dbl>
1 122 0.1 5 7
However, I want to use the filter_at() function:
select <- dat %>%
group_by(A, B) %>%
filter_at(vars(C, D), all_vars(. max))
It did not work. Thanks a lot for your help.
You can do this:
dat %>%
  group_by(A, B) %>%
  filter_at(vars(C, D), all_vars(. == max(.)))
The problem before was that all_vars() expects an expression that evaluates to a logical value for each row. Without a comparison operator such as ==, > or <, the expression . max is not valid R, so it threw an error back at you.
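To see why the predicate has to return a logical, here is a minimal base R sketch of what all_vars(. == max(.)) works out to for this data. It ignores the grouping step, which is an assumption that holds here only because every row has the same A and B; the names is_max_C, is_max_D, keep and result are made up for illustration.

```r
dat <- data.frame(A = c(122, 122, 122), B = c(0.1, 0.1, 0.1),
                  C = c(5, 5, 4), D = c(6, 7, 6))

# One logical vector per column: TRUE where the value equals the column maximum
is_max_C <- dat$C == max(dat$C)   # TRUE  TRUE FALSE
is_max_D <- dat$D == max(dat$D)   # FALSE TRUE FALSE

# all_vars() keeps a row only when every per-column test is TRUE (logical AND)
keep <- is_max_C & is_max_D       # FALSE TRUE FALSE

# Only the second row passes both tests
result <- dat[keep, ]
```

Swapping the & for | would correspond to any_vars(), which keeps a row when at least one column passes the test.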
As of dplyr 1.0, there is a new way to select, filter and mutate over several columns. This is accomplished with the across() function and related helpers. For this particular case, the filtering could also be accomplished as follows:
dat %>%
  group_by(A, B) %>%
  filter(across(c(C, D), ~ . == max(.)))
# A tibble: 1 x 4
# Groups: A, B [1]
A B C D
<dbl> <dbl> <dbl> <dbl>
1 122 0.1 5 7
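In later dplyr releases (1.0.4 and up), if_all() and if_any() were added for use inside filter(), and using across() there has since been deprecated. The following is a sketch of the same filter with if_all(), assuming a dplyr version that provides it; result_if_all is a name chosen for illustration only.

```r
library(dplyr)

dat <- data.frame(A = c(122, 122, 122), B = c(0.1, 0.1, 0.1),
                  C = c(5, 5, 4), D = c(6, 7, 6))

# Keep rows where both C and D equal their within-group maximum.
# if_all() combines the per-column tests with a logical AND.
result_if_all <- dat %>%
  group_by(A, B) %>%
  filter(if_all(c(C, D), ~ .x == max(.x))) %>%
  ungroup()
```

Replacing if_all() with if_any() would keep rows where either column is at its maximum, which mirrors the any_vars() helper of filter_at().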