R: Return rows with only 1 non-NA value for a set of columns

Question

Suppose I have a data.table with the following data:

colA  colB  colC  result
1     2     3     231
1     NA    2     123
NA    3     NA    345
11    NA    NA    754

How would I use dplyr and magrittr to select only the following rows:

colA  colB  colC  result
NA    3     NA    345
11    NA    NA    754

The selection criterion is: exactly 1 non-NA value across columns A to C (i.e. colA, colB, colC).

I have been unable to find a similar question; guessing this is an odd situation.

Answer 1

A base R option would be

df[apply(df, 1, function(x) sum(!is.na(x)) == 1), ]
#  colA colB colC
#3   NA    3   NA
#4   11   NA   NA

A dplyr option is

df %>% filter(rowSums(!is.na(.)) == 1)

Update

In response to your comment, you can do

df[apply(df[, -ncol(df)], 1, function(x) sum(!is.na(x)) == 1), ]
#  colA colB colC result
#3   NA    3   NA    345
#4   11   NA   NA    754

Or the same in dplyr

df %>% filter(rowSums(!is.na(.[-length(.)])) == 1)

This assumes that the last column is the one you'd like to ignore.
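If the column to ignore is not guaranteed to be last, one variation is to name the columns to check explicitly. The sketch below is not part of the original answer; `check_cols` and `n_non_na` are illustrative names, and the data frame simply repeats the sample data from the question.

```r
# Sample data from the question, rebuilt so the block runs on its own
df <- data.frame(
  colA   = c(1L, 1L, NA, 11L),
  colB   = c(2L, NA, 3L, NA),
  colC   = c(3L, 2L, NA, NA),
  result = c(231L, 123L, 345L, 754L)
)

# Hypothetical helper vector: the columns that must hold exactly one value
check_cols <- c("colA", "colB", "colC")

# is.na() on a data frame returns a logical matrix,
# so rowSums() counts the non-NA cells in each row
n_non_na <- rowSums(!is.na(df[check_cols]))

# Keep the rows where exactly one of the checked columns is non-NA
res <- df[n_non_na == 1, ]
print(res)
```

Selecting by name keeps the filter correct if columns are later reordered or new ones are appended.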

Sample data

df <- read.table(text = "colA  colB  colC
1     2     3
1     NA    2
NA    3     NA
11    NA    NA", header = TRUE)

Sample data for update

df <- read.table(text =
"colA  colB  colC  result
1     2     3     231
1     NA    2     123
NA    3     NA    345
11    NA    NA    754
", header = TRUE)

Answer 2

Another option is filter with map

library(dplyr)
library(purrr)
df %>% 
    filter(map(select(., starts_with('col')), ~ !is.na(.)) %>% 
              reduce(`+`) == 1)
#    colA colB colC result
#1   NA    3   NA    345
#2   11   NA   NA    754

Or another option is to use transmute_at

df %>% 
   transmute_at(vars(starts_with('col')), ~ !is.na(.)) %>% 
   reduce(`+`) %>%
   magrittr::equals(1) %>% filter(df, .)
#  colA colB colC result
#1   NA    3   NA    345
#2   11   NA   NA    754

Data

df <- structure(list(colA = c(1L, 1L, NA, 11L), colB = c(2L, NA, 3L, 
NA), colC = c(3L, 2L, NA, NA), result = c(231L, 123L, 345L, 754L
)), class = "data.frame", row.names = c(NA, -4L))

Answer 3

I think this would be possible with filter_at, but I was not able to make it work. Here is one attempt with filter and pmap_lgl, where you can specify the range of columns in select, specify them by position, or use other tidyselect helpers.

library(dplyr)
library(purrr)

df %>%
  filter(pmap_lgl(select(., colA:colC), ~sum(!is.na(c(...))) == 1))

#  colA colB colC result
#1   NA    3   NA    345
#2   11   NA   NA    754

Data

df <- structure(list(colA = c(1L, 1L, NA, 11L), colB = c(2L, NA, 3L, 
NA), colC = c(3L, 2L, NA, NA), result = c(231L, 123L, 345L, 754L
)), class = "data.frame", row.names = c(NA, -4L))
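
As a possible alternative to filter_at, recent dplyr versions provide pick() for selecting columns inside filter(). The following is only a sketch, assuming dplyr 1.1.0 or later is installed; it is not the approach the answer above used.

```r
library(dplyr)

# Same sample data as above, rebuilt so the block runs on its own
df <- data.frame(
  colA   = c(1L, 1L, NA, 11L),
  colB   = c(2L, NA, 3L, NA),
  colC   = c(3L, 2L, NA, NA),
  result = c(231L, 123L, 345L, 754L)
)

# pick() returns the selected columns as a data frame inside filter();
# is.na() turns that into a logical matrix and rowSums() counts non-NA cells
res <- df %>%
  filter(rowSums(!is.na(pick(colA:colC))) == 1)

print(res)
```

The `colA:colC` range can be swapped for any tidyselect helper, such as `starts_with("col")`.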

3 answers

Answer 1 (4, accepted, 2020-01-16 02:25:12)
Answer 2 (1, 2020-01-16 19:41:13)
Answer 3 (0, 2020-01-16 02:48:06)
