dplyr filter based on conditions across and within columns
I'd like to validate survey responses by removing rows with NAs, based on a condition within one column and across the other columns. Sample dataset below:
col1 <- c("Yes", "Yes", "No", "No", NA)
col2 <- c("Yes", NA, "No", NA, NA)
col3 <- c("No", "Yes", "No", NA, NA)
dataset <- data.frame(col1, col2, col3)
dataset
The desired output involves filtering out all rows with NA in col1, and then removing only the rows with a Yes in col1 and NA in any other column. Desired output below:
col1 col2 col3
1 Yes Yes No
2 No No No
3 No <NA> <NA>
I've tried basic filtering operations like
dataset %>% filter(col1 == "Yes" | !is.na(.))
along with other operators such as & and |, but with no luck, and I'm not sure how to apply across or filter_if here to make it work.
I recognize this is very similar to https://stackoverflow.com/questions/43938863/dplyr-filter-with-condition-on-multiple-columns , but different enough to warrant asking this question again.
What am I missing here?
Your logic is encapsulated with:
dataset %>%
  filter(!(is.na(col1) | (col1 == "Yes" & (is.na(col2) | is.na(col3)))))
#> col1 col2 col3
#> 1 Yes Yes No
#> 2 No No No
#> 3 No <NA> <NA>
We can rewrite this with indentation and comments to make the logic clearer:
dataset %>%
  filter(!(                          # Remove any of the following cases:
    is.na(col1)                      # col1 is missing
    |                                # OR
    (col1 == "Yes"                   # col1 is "Yes"
     &                               # AND
     (is.na(col2) | is.na(col3))     # either col2 or col3 is missing
    )
  ))
#> col1 col2 col3
#> 1 Yes Yes No
#> 2 No No No
#> 3 No <NA> <NA>
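To see why the row with No in col1 and NA elsewhere survives while the all-NA row does not, it can help to evaluate the condition outside of filter. The following is a minimal base R sketch using the sample vectors from the question; the names drop and keep are only illustrative and do not appear in the original answer.

```r
# Sample vectors from the question
col1 <- c("Yes", "Yes", "No", "No", NA)
col2 <- c("Yes", NA, "No", NA, NA)
col3 <- c("No", "Yes", "No", NA, NA)

# Rows to drop: col1 is missing, OR col1 is "Yes" with a missing value elsewhere.
# For the last row, col1 == "Yes" evaluates to NA, but TRUE | NA is still TRUE,
# so is.na(col1) alone is enough to mark that row for removal.
drop <- is.na(col1) | (col1 == "Yes" & (is.na(col2) | is.na(col3)))

# filter() keeps only rows where the condition is TRUE
keep <- !drop
keep
#> [1]  TRUE FALSE  TRUE  TRUE FALSE
```

Rows 1, 3 and 4 are kept, which matches the three rows in the output above.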
You can use if_any to deal with the second filtering condition:
dataset %>%
  filter(complete.cases(col1),
         !(col1 == "Yes" & if_any(-col1, is.na)))
col1 col2 col3
1 Yes Yes No
2 No No No
3 No <NA> <NA>
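As a quick sanity check, the two approaches can be run side by side on the sample data. This is a sketch that assumes a dplyr version providing if_any (1.0.4 or later); the names res_explicit and res_if_any are hypothetical and only used for the comparison.

```r
library(dplyr)

# Sample data from the question
col1 <- c("Yes", "Yes", "No", "No", NA)
col2 <- c("Yes", NA, "No", NA, NA)
col3 <- c("No", "Yes", "No", NA, NA)
dataset <- data.frame(col1, col2, col3)

# Explicit version: every column is named in the condition
res_explicit <- dataset %>%
  filter(!(is.na(col1) | (col1 == "Yes" & (is.na(col2) | is.na(col3)))))

# if_any version: checks every column except col1 for NA
res_if_any <- dataset %>%
  filter(complete.cases(col1),
         !(col1 == "Yes" & if_any(-col1, is.na)))

# Both should select the same three rows
all.equal(res_explicit, res_if_any)
```

The if_any version has the practical advantage that it does not need to be edited when more answer columns are added to the survey data, since -col1 selects all remaining columns.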