Filter data by group & preserve empty groups
How can I filter my data by group while keeping the groups that end up empty?
Example:
year = c(1,2,3,1,2,3,1,2,3)
site = rep(c("a", "b", "d"), each = 3)
value = c(3,3,0,1,8,5,10,18,27)
df <- data.frame(year, site, value)
I want to subset the rows where value is greater than 5. For some groups, this is never true, and filter() simply drops those groups from the result.
How can I keep my empty groups and have NA instead? Ideally, I would like to use dplyr functions instead of base R.
My filtering approach, where .preserve = TRUE does not keep the empty groups in the output:
library(dplyr)
df %>%
  group_by(site) %>%
  filter(value > 5, .preserve = TRUE)
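For context, here is a minimal sketch of what .preserve seems to control, assuming a recent dplyr. As documented, it decides whether the grouping structure is recalculated after filtering. It does not add placeholder rows, which would explain why the printed result looks the same either way. The names dropped and kept are only illustrative.

```r
library(dplyr)

df <- data.frame(
  year  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  site  = rep(c("a", "b", "d"), each = 3),
  value = c(3, 3, 0, 1, 8, 5, 10, 18, 27)
)

# Default: grouping is recalculated from the rows that survive the filter
dropped <- df %>% group_by(site) %>% filter(value > 5)

# .preserve = TRUE: the original grouping is kept as is
kept <- df %>% group_by(site) %>% filter(value > 5, .preserve = TRUE)

n_groups(dropped)  # counts only groups that still have rows
n_groups(kept)     # also counts the now-empty group "a"
nrow(kept)         # same rows as `dropped`: nothing is added for "a"
```

So the empty group survives only in the grouping metadata. Getting an NA row for it needs an extra step.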
Expected output:
   year site  value
  <dbl> <fct> <dbl>
1    NA a        NA
2     2 b         8
3     1 d        10
4     2 d        18
5     3 d        27
With the addition of tidyr, you can do:
library(tidyr)
df %>%
  group_by(site) %>%
  filter(value > 5) %>%
  ungroup() %>%
  complete(site = df$site)  # re-add every site from the original data, filling with NA
  site   year value
  <fct> <dbl> <dbl>
1 a        NA    NA
2 b         2     8
3 d         1    10
4 d         2    18
5 d         3    27
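One way to read the complete() step is as a left join from the full list of sites onto the filtered rows. The sketch below uses that join instead of complete(). It is an illustrative dplyr-only alternative rather than part of the answer above, and the names all_sites, filtered and result are made up for the example.

```r
library(dplyr)

df <- data.frame(
  year  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  site  = rep(c("a", "b", "d"), each = 3),
  value = c(3, 3, 0, 1, 8, 5, 10, 18, 27),
  stringsAsFactors = FALSE
)

# Every site that exists in the original data, one row each
all_sites <- df %>% distinct(site)

# Rows that pass the condition (grouping is not needed for a row-wise test)
filtered <- df %>% filter(value > 5)

# Sites without a match in `filtered` get NA in year and value
result <- all_sites %>% left_join(filtered, by = "site")
result
```

The join makes it explicit where the NA row comes from: site "a" has no matching row in the filtered data.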
Or, if you want to keep it in dplyr:
df %>%
  group_by(site) %>%
  filter(value > 5) %>%
  bind_rows(df %>%
              group_by(site) %>%
              filter(all(value <= 5)) %>%
              summarise_all(~ NA))
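The same idea can be spelled out with explicit NA columns, which avoids summarise_all() (superseded in current dplyr) and keeps the column types numeric. This is a sketch under the assumption that year and value are the only non-grouping columns. The names kept, empty and result are illustrative, and the final arrange() is only there to put the empty site in order.

```r
library(dplyr)

df <- data.frame(
  year  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  site  = rep(c("a", "b", "d"), each = 3),
  value = c(3, 3, 0, 1, 8, 5, 10, 18, 27),
  stringsAsFactors = FALSE
)

# Rows that satisfy the condition
kept <- df %>% filter(value > 5)

# One NA placeholder row for each site where no row satisfies it
empty <- df %>%
  group_by(site) %>%
  filter(all(value <= 5)) %>%
  summarise(year = NA_real_, value = NA_real_)

# Stack both parts and order by site
result <- bind_rows(kept, empty) %>% arrange(site)
result
```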
Using the nesting functionality of tidyr and applying purrr::map:
df %>%
  group_by(site) %>%
  tidyr::nest() %>%
  mutate(data = purrr::map(data, . %>% filter(value > 5))) %>%
  tidyr::unnest(cols = c(data), keep_empty = TRUE)
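To see why this works, it can help to look at the nested intermediate before unnesting. The sketch below adds a hypothetical n_kept column (not part of the answer above) that counts the rows left in each nested data frame. keep_empty = TRUE then turns the zero-row entry into a single NA row.

```r
library(dplyr)
library(tidyr)
library(purrr)

df <- data.frame(
  year  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  site  = rep(c("a", "b", "d"), each = 3),
  value = c(3, 3, 0, 1, 8, 5, 10, 18, 27),
  stringsAsFactors = FALSE
)

nested <- df %>%
  group_by(site) %>%
  nest() %>%
  mutate(
    data   = map(data, ~ filter(.x, value > 5)),  # filter inside each group
    n_kept = map_int(data, nrow)                  # rows remaining per site
  )

nested$n_kept  # site "a" has no rows left, but its row in `nested` remains

# Zero-row list entries become one row of NAs instead of disappearing
result <- nested %>% unnest(cols = c(data), keep_empty = TRUE)
result
```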