Filter rows that contain a specific boolean value in any column of a data frame in R
Let's say I have a data frame:
data <- data.frame(w = c(1, 2, 3, 4), x = c(FALSE, FALSE, FALSE, FALSE), y = c(TRUE, TRUE, FALSE, TRUE),
                   z = c(TRUE, FALSE, FALSE, TRUE), z1 = c(12, 4, 5, 15))
data
#> w x y z z1
#> 1 1 FALSE TRUE TRUE 12
#> 2 2 FALSE TRUE FALSE 4
#> 3 3 FALSE FALSE FALSE 5
#> 4 4 FALSE TRUE TRUE 15
Question
How do I filter out the rows in which all boolean variables are FALSE? In this case, that is row 3. In other words, I would like to get a data frame that has at least one TRUE value per row.
Expected output
#> w x y z z1
#> 1 1 FALSE TRUE TRUE 12
#> 2 2 FALSE TRUE FALSE 4
#> 3 4 FALSE TRUE TRUE 15
Attempt
library(tidyverse)
data %>% filter(x == T | y == T | z == T)
#> w x y z z1
#> 1 1 FALSE TRUE TRUE 12
#> 2 2 FALSE TRUE FALSE 4
#> 3 4 FALSE TRUE TRUE 15
The above works, but it does not scale to many columns. Is there a more convenient option using dplyr's filter() function?
rowSums() is a good option: TRUE counts as 1 and FALSE as 0, so the row sum is the number of TRUE values in that row.
cols <- c("x", "y", "z")
## all FALSE
data[rowSums(data[cols]) == 0, ]
## at least 1 TRUE
data[rowSums(data[cols]) >= 1, ]
## etc.
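To make the "TRUE is 1, FALSE is 0" point concrete, here is a small base R sketch on the question's sample data. The variable names `true_counts` and `at_least_one` are illustrative only and do not come from the original answer.

```r
# Sample data from the question
data <- data.frame(
  w  = c(1, 2, 3, 4),
  x  = c(FALSE, FALSE, FALSE, FALSE),
  y  = c(TRUE, TRUE, FALSE, TRUE),
  z  = c(TRUE, FALSE, FALSE, TRUE),
  z1 = c(12, 4, 5, 15)
)
cols <- c("x", "y", "z")

# Logical values are coerced to 0/1 when summed,
# so each row sum is the count of TRUE values in that row
true_counts <- rowSums(data[cols])   # 2, 1, 0, 2 for rows 1 to 4

# Keep the rows with at least one TRUE (drops row 3)
at_least_one <- data[true_counts >= 1, ]
```

If the logical columns may contain NA, passing `na.rm = TRUE` to rowSums() is probably what you want, since otherwise the row sum itself becomes NA.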
With dplyr, I would use the same idea like this:
data %>%
  filter(
    rowSums(select(., all_of(cols))) >= 1
  )
# after @Gregor Thomas's suggestion on using TRUE or FALSE
data[!(apply(!data[, c('x', 'y', 'z')], 1, all)), ]
# without rowSums
data[!(apply(data[, c('x', 'y', 'z')] == FALSE, 1, all)), ]
# with rowSums: drop rows where all 3 columns are FALSE
data[rowSums(data[, c('x', 'y', 'z')] == FALSE) != 3, ]
# w x y z z1
#1 1 FALSE TRUE TRUE 12
#2 2 FALSE TRUE FALSE 4
#4 4 FALSE TRUE TRUE 15
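Since the question asks for something that scales, one possible variation is to detect the logical columns instead of typing their names. This is only a sketch in base R, not part of the original answers; `is_lgl` and `kept` are hypothetical names.

```r
# Sample data from the question
data <- data.frame(
  w  = c(1, 2, 3, 4),
  x  = c(FALSE, FALSE, FALSE, FALSE),
  y  = c(TRUE, TRUE, FALSE, TRUE),
  z  = c(TRUE, FALSE, FALSE, TRUE),
  z1 = c(12, 4, 5, 15)
)

# Flag every column whose type is logical (here x, y and z)
is_lgl <- vapply(data, is.logical, logical(1))

# Keep rows where at least one of those columns is TRUE
kept <- data[rowSums(data[is_lgl]) > 0, ]
```

This avoids hard-coding both the column names and the column count (the `!= 3` above), at the cost of assuming that every logical column should take part in the test.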
With dplyr's filter(),
library(dplyr)
filter(data, (x + y + z) > 0 )
w x y z z1
1 1 FALSE TRUE TRUE 12
2 2 FALSE TRUE FALSE 4
3 4 FALSE TRUE TRUE 15
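The sum `x + y + z` still names each column by hand. A possibly more scalable sketch, assuming dplyr 1.0.4 or later (where `if_any()` is available), selects all logical columns with `where(is.logical)`. This is an alternative to the answer above, not the original author's code; `result` is an illustrative name.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  w  = c(1, 2, 3, 4),
  x  = c(FALSE, FALSE, FALSE, FALSE),
  y  = c(TRUE, TRUE, FALSE, TRUE),
  z  = c(TRUE, FALSE, FALSE, TRUE),
  z1 = c(12, 4, 5, 15)
)

# if_any() is TRUE for a row when any selected column is TRUE;
# identity is passed so the logical columns are used as they are
result <- filter(data, if_any(where(is.logical), identity))
```

Swapping `if_any()` for `if_all()` would instead keep only the rows where every logical column is TRUE.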