Filter rows of one column on the condition that rows in all other columns are NA, and repeat for n columns
Apologies if this has been asked before; I wasn't quite sure how to phrase the question, which might have kept related questions from showing up in my search.
I have a data set like this:
toy <-
  data.frame(
    Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
    Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
    Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
    Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
  )
toy
I'm trying to determine how many rows have a non-NA value in one column and NA in all other columns. Take column Serves_1, for example:
library(dplyr)
toy %>%
  filter(
    !is.na(Serves_1) &
      is.na(Serves_2) &
      is.na(Serves_3) &
      is.na(Serves_4)
  ) %>%
  nrow()
There is one row where Serves_1 has a non-NA value and, simultaneously, all other columns are NA.
This code works fine, but I need to repeat the procedure for each column. I could just move the exclamation mark down one line for each column, but in my real dataset I have to do this for over 20 columns.
Is there a more efficient way to do this (preferably using dplyr)?
You can use rowSums:
library(dplyr)
toy <-
  data.frame(
    Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
    Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
    Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
    Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
  ) %>%
  mutate(na_sum = rowSums(is.na(.)))
This gives you:
  Serves_1 Serves_2 Serves_3 Serves_4 na_sum
1      yes     <NA>     <NA>     <NA>      3
2     <NA>     <NA>       no      yes      2
3      yes       no      yes      yes      0
4       no       no       no       no      0
5      yes       no     <NA>      yes      1
6       no      yes       no       no      0
You can then filter for rows where na_sum == 3 (the number of columns minus one) to get all rows where exactly one value is non-NA and the rest are NA:
toy %>%
  filter(na_sum == 3)
Which gives us:
  Serves_1 Serves_2 Serves_3 Serves_4 na_sum
1      yes     <NA>     <NA>     <NA>      3
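The question also asks for this count per column. One possible way to get that, sketched here as an assumption rather than as part of the original answer, is to keep only the rows with exactly one non-NA value and then count the non-NA entries in each column of that subset. The names only_one and per_column are made up for illustration.

```r
library(dplyr)

toy <- data.frame(
  Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
  Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
  Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
  Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
)

# Keep the rows in which exactly one column is non-NA
# (`.` is the piped-in data frame, as in the mutate() call above)
only_one <- toy %>%
  filter(rowSums(!is.na(.)) == 1)

# In those rows the single non-NA value sits in exactly one column,
# so counting non-NA entries per column gives one count per column
per_column <- colSums(!is.na(only_one))
per_column
```

Because the filter compares against 1 non-NA value instead of a hard-coded number of NAs, the same code should carry over to a data frame with 20 or more columns, provided every column in it is one of the columns being checked.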
An additional option in base R, which counts the rows that have exactly one non-NA value:
sum(apply(toy, 1, function(x) (length(x) - 1 == sum(is.na(x)))))
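For comparison, here is a minimal sketch of a vectorized form of the same count. It assumes toy is the original four-column data frame, without the na_sum column added in the other answer, since an extra column would change the number of columns being compared against. The names n_single and n_apply are illustrative only.

```r
toy <- data.frame(
  Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
  Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
  Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
  Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
)

# Count the rows whose number of NAs equals the number of columns minus one,
# working on the whole logical matrix at once
n_single <- sum(rowSums(is.na(toy)) == ncol(toy) - 1)

# The row-by-row apply() version, for comparison
n_apply <- sum(apply(toy, 1, function(x) length(x) - 1 == sum(is.na(x))))

n_single == n_apply
```

Both forms count the same rows; rowSums avoids calling an R function once per row, which may matter on larger data.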