Filter dataframe within a group with one column meeting an AND condition in R
I have the following data frame, and I need to keep only the rows that belong to groups (id) containing both an "intake" and a "discharge". The result should go from looking like this:
> df <- tibble(id = c(1, 1, 2, 3, 3, 3, 4, 4, 5, 6, 7, 7),
+ type = c("intake", "discharge", "intake", "intake", "discharge", "other",
+ "intake", "discharge", "intake", "intake", "intake", "discharge"))
> df
id type
<dbl> <chr>
1 1 intake
2 1 discharge
3 2 intake
4 3 intake
5 3 discharge
6 4 intake
7 4 discharge
8 5 intake
9 6 intake
10 7 intake
11 7 discharge
To this:
id type
<dbl> <chr>
1 1 intake
2 1 discharge
3 3 intake
4 3 discharge
5 4 intake
6 4 discharge
7 7 intake
8 7 discharge
Groups (ids) that do not have both an intake AND a discharge should be removed, and only those that have both should be kept.
I hope that makes sense... sorry, it has been a long day.
library(dplyr)
df %>%
  group_by(id) %>%
  # keep a group only if it has at least one "intake" and at least one "discharge"
  filter(sum(type == "intake") >= 1,
         sum(type == "discharge") >= 1) %>%
  # add below if we only want intake/discharge lines
  # filter(type %in% c("intake", "discharge")) %>%
  ungroup()
Result (this differs from the desired output because the question's sample data includes an "other" row for id 3, and it is unclear whether such rows should be kept):
# A tibble: 9 x 2
id type
<dbl> <chr>
1 1 intake
2 1 discharge
3 3 intake
4 3 discharge
5 3 other
6 4 intake
7 4 discharge
8 7 intake
9 7 discharge
Here's a way to select groups that have both "intake" and "discharge".
library(dplyr)
values <- c('intake', 'discharge')
df %>%
  group_by(id) %>%
  filter(all(values %in% type) & type %in% values) %>%
  ungroup()
# id type
# <dbl> <chr>
#1 1 intake
#2 1 discharge
#3 3 intake
#4 3 discharge
#5 4 intake
#6 4 discharge
#7 7 intake
#8 7 discharge
all(values %in% type) is evaluated once per group and is TRUE only when the group contains both values, so it selects complete groups. type %in% values is evaluated per row and, within those groups, keeps only the rows whose type is one of the two values.
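To make the difference between the two conditions visible, here is a minimal sketch that computes each one as its own column instead of filtering on them. It uses the sample data from the question; the names checks, group_complete and row_wanted are made up for this illustration and appear in neither answer.

```r
library(dplyr)

# Sample data from the question (includes the "other" row for id 3)
df <- tibble(
  id   = c(1, 1, 2, 3, 3, 3, 4, 4, 5, 6, 7, 7),
  type = c("intake", "discharge", "intake", "intake", "discharge", "other",
           "intake", "discharge", "intake", "intake", "intake", "discharge")
)
values <- c("intake", "discharge")

# Evaluate each condition per group without filtering, to see what it does
checks <- df %>%
  group_by(id) %>%
  mutate(
    group_complete = all(values %in% type),  # one value per group, recycled to every row
    row_wanted     = type %in% values        # one value per row
  ) %>%
  ungroup()

print(checks)
```

The "other" row of id 3 has group_complete TRUE but row_wanted FALSE, so only the combined condition drops it. The rows for ids 2, 5 and 6 have row_wanted TRUE but group_complete FALSE, because those groups have no "discharge".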