
Filter rows of one column on the condition that rows in all other columns are NA, and repeat for n columns

Apologies if this has been asked before; I wasn't quite sure how to phrase the question, which might have prevented other questions from showing up in my search.

I have a data set like this:

toy <- 
  data.frame(
    Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
    Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
    Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
    Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
  )
toy

I'm trying to determine how many rows have a non-NA value in one column and NA in all other columns. Take column Serves_1, for example:

library(dplyr)

# Count rows where Serves_1 is the only non-NA column
toy %>%
  filter(
    !is.na(Serves_1) &
      is.na(Serves_2) &
      is.na(Serves_3) &
      is.na(Serves_4)
  ) %>%
  nrow()

There is one row where Serves_1 has a non-NA value and all other columns are NA.

This code works fine, but I need to repeat this procedure for each column. I could just move the exclamation mark down the line for each column, but in my real dataset I have to do this for over 20 columns.

Is there a more efficient way to do this (preferably using dplyr)?

You can use rowSums:

library(dplyr)
toy <- 
  data.frame(
    Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
    Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
    Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
    Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
  ) %>% 
  mutate(na_sum = rowSums(is.na(.)))

This gives you:

  Serves_1 Serves_2 Serves_3 Serves_4 na_sum
1      yes     <NA>     <NA>     <NA>      3
2     <NA>     <NA>       no      yes      2
3      yes       no      yes      yes      0
4       no       no       no       no      0
5      yes       no     <NA>      yes      1
6       no      yes       no       no      0

You can then filter rows where na_sum == 3 (the number of columns minus one) to get all rows where exactly one value is not NA and the rest are:

toy %>%
  filter(na_sum == 3)

Which gives us:

  Serves_1 Serves_2 Serves_3 Serves_4 na_sum
1      yes     <NA>     <NA>     <NA>      3
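
The filter above finds rows with a single non-NA value in any column. To get the count separately for each column, as the question asks, one possible sketch is shown below. It rebuilds the original toy data without the na_sum helper column; the names only_one, counts and counts_dplyr are illustrative, and the dplyr variant assumes dplyr 1.0.0 or later for across().

```r
library(dplyr)

toy <- data.frame(
  Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
  Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
  Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
  Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
)

# Base R: TRUE for rows that have exactly one non-NA value
only_one <- rowSums(!is.na(toy)) == 1

# For each column, count the rows where that column holds the single non-NA value.
# only_one has one entry per row, so it lines up with each column of the matrix.
counts <- colSums(!is.na(toy) & only_one)
counts  # Serves_1 is 1, the other columns are 0 for the toy data

# dplyr variant: keep the single-value rows, then count non-NA values per column
counts_dplyr <- toy %>%
  filter(rowSums(!is.na(.)) == 1) %>%
  summarise(across(everything(), ~ sum(!is.na(.x))))
counts_dplyr
```

Because nothing in this sketch names individual columns, it should carry over unchanged to a data frame with 20 or more columns, provided every column in the data frame is meant to be part of the check.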

An additional option:

sum(apply(toy, 1, function(x) (length(x) - 1 == sum(is.na(x)))))
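
One caveat with the one-liner above: length(x) counts every column in the data frame, so if toy already carries the na_sum helper column from the earlier answer, the comparison would no longer match any row. A minimal sketch that restricts the check to the Serves_ columns first, assuming that prefix identifies the columns of interest (serves and n_single are illustrative names):

```r
toy <- data.frame(
  Serves_1 = c("yes", NA, "yes", "no", "yes", "no"),
  Serves_2 = c(NA, NA, "no", "no", "no", "yes"),
  Serves_3 = c(NA, "no", "yes", "no", NA, "no"),
  Serves_4 = c(NA, "yes", "yes", "no", "yes", "no")
)

# Simulate the helper column added in the earlier answer
toy$na_sum <- rowSums(is.na(toy))

# Keep only the columns whose names start with "Serves_"
serves <- toy[, startsWith(names(toy), "Serves_"), drop = FALSE]

# Count rows where all but one of the selected columns are NA
n_single <- sum(rowSums(is.na(serves)) == ncol(serves) - 1)
n_single  # 1 for the toy data
```

rowSums(is.na(...)) is used here instead of apply() because it works on the logical matrix directly, but the comparison is the same one the original one-liner makes.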

