Different filter rules for groups using dplyr
Sample data:
library(dplyr)
df <- data.frame(loc.id = rep(1:2, each = 11),
                 x = c(35,51,68,79,86,90,92,93,95,98,100,35,51,68,79,86,90,92,92,93,94,94))
For each loc.id, I want to filter out x <= 95.
df %>% group_by(loc.id) %>% filter(row_number() <= which.max(x >= 95))
loc.id x
<int> <dbl>
1 1 35
2 1 51
3 1 68
4 1 79
5 1 86
6 1 90
7 1 92
8 1 93
9 1 95
10 2 35
However, in group 2 all the values are less than 95, so I want to keep all values of x for group 2. The line above does not do that: it keeps only the first row of group 2.
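The one-row result for group 2 follows from how which.max() treats a vector that is entirely FALSE: every element ties for the maximum, so it returns index 1. A minimal base R sketch of that behaviour, where x2, hits, first_hit and keep are illustrative names that are not part of the original post:

```r
# Group 2's x column from the sample data above.
x2 <- c(35, 51, 68, 79, 86, 90, 92, 92, 93, 94, 94)

hits <- x2 >= 95                # all FALSE: no value reaches 95
first_hit <- which.max(hits)    # all elements tie, so the first index is returned
first_hit

which(hits)                     # integer(0): which() shows there is no hit at all

# Same comparison the filter() call makes with row_number()
keep <- seq_along(x2) <= first_hit
x2[keep]                        # only the first value survives
```

So the filter needs a fallback for groups without any TRUE, rather than relying on which.max() alone.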
Perhaps something like this?
df %>%
  group_by(loc.id) %>%
  mutate(n = sum(x > 95)) %>%    # per-group count of values above 95
  filter(n == 0 | x > 95) %>%    # keep the whole group if none exceed 95, else only x > 95
  ungroup() %>%
  select(-n)
## A tibble: 13 x 2
# loc.id x
# <int> <dbl>
# 1 1 98.
# 2 1 100.
# 3 2 35.
# 4 2 51.
# 5 2 68.
# 6 2 79.
# 7 2 86.
# 8 2 90.
# 9 2 92.
#10 2 92.
#11 2 93.
#12 2 94.
#13 2 94.
Note that removing entries where x <= 95 corresponds to retaining entries where x > 95 (not x >= 95).
You can use match to get the index of the first TRUE, and have it return the length of the group via the nomatch parameter when no match is found:
df %>%
  group_by(loc.id) %>%
  filter(row_number() <= match(TRUE, x >= 95, nomatch=n()))
# A tibble: 20 x 2
# Groups: loc.id [2]
# loc.id x
# <int> <dbl>
# 1 1 35
# 2 1 51
# 3 1 68
# 4 1 79
# 5 1 86
# 6 1 90
# 7 1 92
# 8 1 93
# 9 1 95
#10 2 35
#11 2 51
#12 2 68
#13 2 79
#14 2 86
#15 2 90
#16 2 92
#17 2 92
#18 2 93
#19 2 94
#20 2 94
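To see where the 9 and 11 kept rows come from, the cutoff can be computed on each group's vector directly. This is a base R sketch only: x1, x2, cutoff1 and cutoff2 are illustrative names, and length() stands in for what n() supplies inside the grouped filter().

```r
x1 <- c(35, 51, 68, 79, 86, 90, 92, 93, 95, 98, 100)  # group 1
x2 <- c(35, 51, 68, 79, 86, 90, 92, 92, 93, 94, 94)   # group 2

# Position of the first TRUE, or the group length when there is none.
cutoff1 <- match(TRUE, x1 >= 95, nomatch = length(x1))
cutoff2 <- match(TRUE, x2 >= 95, nomatch = length(x2))

cutoff1   # 9: the value 95 sits at position 9
cutoff2   # 11: no match, so the whole group is kept
```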
Or negate the lagged cumsum and use it as the filter condition:
df %>% group_by(loc.id) %>% filter(!lag(cumsum(x >= 95), default=FALSE))
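The one-liner above can be unpacked step by step on a single group. This is a rough base R sketch: the names running, lagged and keep1 are illustrative, and c(0, head(running, -1)) is used here as a stand-in for dplyr's lag() with a zero default.

```r
x1 <- c(35, 51, 68, 79, 86, 90, 92, 93, 95, 98, 100)  # group 1
x2 <- c(35, 51, 68, 79, 86, 90, 92, 92, 93, 94, 94)   # group 2

running <- cumsum(x1 >= 95)         # 0 0 0 0 0 0 0 0 1 2 3
lagged  <- c(0, head(running, -1))  # shift by one so the first hit itself is kept
keep1   <- !lagged                  # TRUE while no earlier row has reached 95
x1[keep1]                           # 35 51 68 79 86 90 92 93 95

# Group 2 never reaches 95, so the running count stays 0 and every row is kept.
keep2 <- !c(0, head(cumsum(x2 >= 95), -1))
all(keep2)
```

Because the count never rises above zero in a group without a hit, this version needs no special case for group 2.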
A solution using all together with the dplyr package:
library(dplyr)
df %>%
  group_by(loc.id) %>%
  filter((x > 95) | all(x <= 95))  # keep x > 95, OR every row when all x in the group are <= 95
# # Groups: loc.id [2]
# loc.id x
# <int> <dbl>
# 1 1 98.0
# 2 1 100
# 3 2 35.0
# 4 2 51.0
# 5 2 68.0
# 6 2 79.0
# 7 2 86.0
# 8 2 90.0
# 9 2 92.0
# 10 2 92.0
# 11 2 93.0
# 12 2 94.0
# 13 2 94.0