How to implement a conditional filter in an R dplyr filter

Question

I have the following data with two columns and 15 rows:

data_1 <- structure(list(column_1 = c(120, 130, NA, NA, NA, 130, 182, 130, 
NA, 925, NA, 181, 182, 188, NA), column_2 = c(7, NA, 1, 1, 1, 
3, 7, NA, 1, NA, 1, NA, 1, 1, 1)), row.names = c(NA, -15L), class = c("tbl_df", 
"tbl", "data.frame"))

   column_1 column_2
1       120        7
2       130       NA
3        NA        1
4        NA        1
5        NA        1
6       130        3
7       182        7
8       130       NA
9        NA        1
10      925       NA
11       NA        1
12      181       NA
13      182        1
14      188        1
15       NA        1

By using filters, I would like to keep the observations with the following values in column_1: NA, 130, 181, 182, 188.
Furthermore, I would like to remove all observations with the entry 7 in column_2.

So far, this works with the following code:

data_1 %>% filter(is.na(column_1) | column_1 %in% c(130, 181, 182, 188), !column_2 %in% 7)

Now I want to add an additional filter: if the value in column_1 is 130 and the value in column_2 is NA, then remove the observation (that is, rows 2 and 8 in data_1). How could I do this? What are the best ways to achieve this conditional filter? I have tried the following commands so far, none of which lead to the desired result:

data_1 %>% filter(is.na(column_1) | column_1 %in% c(130, 181, 182, 188), !column_2 %in% 7) %>% filter(case_when(column_1 == 130 ~ !is.na(column_2)))

The result here is that only the entry 130, 3 is kept.

data_1 %>% filter(is.na(column_1) | column_1 %in% c(130, 181, 182, 188), !column_2 %in% 7) %>% filter(case_when(column_1 == 130 ~ !is.na(column_2), TRUE ~ is.na(column_2)))

Now two entries remain: 130, 3 and 181, NA.

I have also tried the following two commands:

data_1 %>% filter(is.na(column_1) | column_1 %in% c(130, 181, 182, 188), !column_2 %in% 7) %>% filter(if (column_2 == 130) !is.na(column_2))
data_1 %>% filter(is.na(column_1) | column_1 %in% c(130, 181, 182, 188), !column_2 %in% 7) %>% {if (column_2 == 130) filter(., !is.na(column_2))}
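A note on why the first case_when attempt keeps only 130, 3: when no branch matches a row, case_when returns NA, and filter() drops rows whose condition is NA. The if () variants have a different problem, since if () is not vectorized and expects a single logical value rather than a whole column. The sketch below uses two short made-up vectors (not the full data_1) to show the NA behaviour; the names no_default and with_default are only for illustration.

```r
library(dplyr)

# Small illustrative vectors, not the full data set from the question
column_1 <- c(130, 130, NA, 181)
column_2 <- c(3, NA, 1, NA)

# No fallback branch: rows where column_1 is not 130 (or is NA) become NA,
# and filter() would drop them
no_default <- case_when(column_1 == 130 ~ !is.na(column_2))
print(no_default)

# A TRUE ~ TRUE fallback keeps every row that is not handled by the first branch
with_default <- case_when(column_1 == 130 ~ !is.na(column_2), TRUE ~ TRUE)
print(with_default)
```

In the second attempt above, the fallback TRUE ~ is.na(column_2) keeps the remaining rows only if column_2 is NA, which is why just 181, NA survives alongside 130, 3.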

Answer 1

Are you looking for something like this?

library(tidyverse)


data_1 |>
  filter(case_when(
    is.na(column_1) ~ T,
    column_1 == 130 & is.na(column_2 ) ~ F,
    column_2 == 7 ~ F,
    column_1 %in% c(130, 181, 182, 188) ~ T,
    T ~ F
  ))
#> # A tibble: 10 x 2
#>    column_1 column_2
#>       <dbl>    <dbl>
#>  1       NA        1
#>  2       NA        1
#>  3       NA        1
#>  4      130        3
#>  5       NA        1
#>  6       NA        1
#>  7      181       NA
#>  8      182        1
#>  9      188        1
#> 10       NA        1

I just added all of your conditions to one big case_when. The branches are checked in order and the first one that matches a row decides its fate, so make sure each condition maps to T or F (shorthand for TRUE and FALSE) so that the filter works correctly. When a condition maps to T the row is kept, and when it maps to F the row is removed. The final T ~ F branch removes every row that none of the earlier conditions matched.
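As a possible alternative, the same rows can probably be obtained without case_when by adding one more plain logical condition to the filter from the question. This is a minimal sketch, not part of the original answer; it rebuilds the data with data.frame() and stores the output in a variable named result, both chosen here for illustration. Using %in% instead of == for the 130 test avoids NA in the condition, because %in% never returns NA.

```r
library(dplyr)

# Same values as data_1 in the question, rebuilt as a plain data frame
data_1 <- data.frame(
  column_1 = c(120, 130, NA, NA, NA, 130, 182, 130, NA, 925, NA, 181, 182, 188, NA),
  column_2 = c(7, NA, 1, 1, 1, 3, 7, NA, 1, NA, 1, NA, 1, 1, 1)
)

result <- data_1 %>%
  filter(
    # keep NA and the listed values in column_1
    is.na(column_1) | column_1 %in% c(130, 181, 182, 188),
    # drop rows with 7 in column_2
    !column_2 %in% 7,
    # drop rows where column_1 is 130 and column_2 is NA
    !(column_1 %in% 130 & is.na(column_2))
  )

print(result)
```

Conditions separated by commas inside filter() are combined with a logical AND, so each requirement stays on its own line and can be read or removed independently.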

Answer 2

I would only add that structure(list()) may be needlessly verbose here unless it is used for another reason. Simpler would be:

data.frame(column_1 = c(120, 130, NA, NA, NA, 130, 182, 130, NA, 925, NA, 181, 182, 188, NA), 
           column_2 = c(7, NA, 1, 1, 1, 3, 7, NA, 1, NA, 1, NA, 1, 1, 1))

# or

tibble::tibble(column_1 = c(120, 130, NA, NA, NA, 130, 182, 130, NA, 925, NA, 181, 182, 188, NA), 
               column_2 = c(7, NA, 1, 1, 1, 3, 7, NA, 1, NA, 1, NA, 1, 1, 1))
