
How to filter in R using multiple OR statements? Dplyr

I tried searching for this but couldn't find what I needed.

This is what my data looks like:

mydata <- data.frame(Chronic = c("Yes", "No", "Yes"),
                      Mental = c("No", "No", "No"),
                      SA = c("No", "No", "Yes"))

> mydata
  Chronic Mental  SA
1     Yes     No  No
2      No     No  No
3     Yes     No Yes

My goal is to get the count of rows where any of the columns equals Yes. In this case rows 1 and 3 have at least one Yes, whereas row 2 only has No.

Is there an easy way to do this?

We can use rowSums on a logical matrix and then take the sum of the resulting logical vector to return the count of rows having at least one 'Yes':

sum(rowSums(mydata == 'Yes') > 0)
#[1] 2
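
To see why the one-liner works, the steps can be unpacked one at a time. This is only a sketch on the same mydata; the intermediate names (is_yes, yes_per_row, has_yes, n_rows_with_yes) are made up for illustration and are not part of the original answer.

```r
mydata <- data.frame(Chronic = c("Yes", "No", "Yes"),
                     Mental = c("No", "No", "No"),
                     SA = c("No", "No", "Yes"))

# Element-wise comparison gives a logical matrix (TRUE where the cell is "Yes")
is_yes <- mydata == "Yes"

# rowSums() counts the TRUE values in each row
yes_per_row <- rowSums(is_yes)

# TRUE for rows with at least one "Yes"
has_yes <- yes_per_row > 0

# Summing a logical vector counts its TRUE values
n_rows_with_yes <- sum(has_yes)
```

If the columns may contain NA, rowSums has an na.rm argument that is probably worth setting, since a row containing NA would otherwise produce NA rather than a count.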

Or with tidyverse:

library(dplyr)
mydata %>% 
   rowwise %>%
   mutate(Count = + any(c_across(everything()) == 'Yes')) %>%
   ungroup %>% 
   pull(Count) %>%
   sum
#[1] 2
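
Since the question is phrased in terms of filtering, another possible route is filter combined with if_any, which keeps a row when the condition holds for at least one of the selected columns. This is a sketch, not part of the original answer, and it assumes a dplyr version that provides if_any (1.0.4 or later); the names rows_with_yes and n_yes are illustrative.

```r
library(dplyr)

mydata <- data.frame(Chronic = c("Yes", "No", "Yes"),
                     Mental = c("No", "No", "No"),
                     SA = c("No", "No", "Yes"))

# Keep rows where any column equals "Yes"
rows_with_yes <- mydata %>%
  filter(if_any(everything(), ~ .x == "Yes"))

# Count the rows that survived the filter
n_yes <- nrow(rows_with_yes)
```

This avoids rowwise, and everything() can be swapped for a narrower selection such as c(Chronic, SA) if only some columns should be checked.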

If you want to spell out the conditions column by column (as opposed to using across), you can write them out using case_when:

mydata %>% 
  mutate(yes_column = case_when(Chronic == 'Yes' | Mental == 'Yes' | SA == 'Yes' ~ 1,
                                TRUE ~ 0)) %>% 
  summarise(total = sum(yes_column))

This creates a binary flag that is 1 if Yes appears in any of the columns and 0 otherwise. Writing each column out is quite useful for checking that the code works for each column, particularly for spotting data quality problems such as 'Yes' vs 'yes' or even 'Y'. The | operator denotes OR, and you can use & for AND.
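
If spellings such as 'yes' or 'Y' do turn up, one option is to normalise the values before comparing. The sketch below uses a made-up data frame (messy) and a hypothetical helper (is_yes); neither comes from the original post, and the set of spellings treated as "yes" is an assumption that should be adjusted to the real data.

```r
library(dplyr)

# Hypothetical data with inconsistent spellings
messy <- data.frame(Chronic = c("Yes", "no", "Y"),
                    Mental = c("No", "No", "N"),
                    SA = c("no", "No", "No"))

# Inspect which distinct values actually occur across all columns
distinct_values <- unique(unlist(messy))

# Hypothetical helper: trim whitespace, lower-case, then match accepted spellings
is_yes <- function(x) tolower(trimws(x)) %in% c("yes", "y")

# Same OR logic as above, applied to the normalised values
n_yes <- messy %>%
  filter(is_yes(Chronic) | is_yes(Mental) | is_yes(SA)) %>%
  nrow()
```

Here rows 1 and 3 are kept, so n_yes is 2, whereas a strict comparison with 'Yes' would only have matched row 1.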


 