Setting a threshold for complete cases to remove NAs across multiple columns in R

Question

There might be an easy answer to this, but I have not been able to make it work. I have a data table that looks like this:

df <- data.table(t = c(1, 2, 3), a = c(NA, NA, 4), b = c(NA, 4, NA), c = c(NA, 4, NA))

How can I remove only the rows where every column except "t" is NA? It needs to be fast because my data files are large, so I would particularly like to do it with complete.cases. I have not found a solution to this problem yet.

The result should look like this:

dfRes <- data.table(t = c(2, 3), a = c(NA, 4), b = c(4, NA), c = c(4, NA))

Answer 1

We can use rowSums on the columns other than "t": count the non-NA values in each row and keep the rows where that count is greater than zero.

library(data.table)

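# Positions of every column except "t"
cols <- which(names(df) != 't')
# Keep rows that have at least one non-NA value in those columns
df[rowSums(!is.na(df[, ..cols])) > 0, ]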

#   t  a  b  c
#1: 2 NA  4  4
#2: 3  4 NA NA
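The same row count can act as an adjustable threshold, which is what the title asks about. The following is a minimal sketch, assuming the sample data from the question; `min_non_na` is a hypothetical name introduced here for illustration and is not part of the original answer.

```r
library(data.table)

# Sample data from the question
df <- data.table(t = c(1, 2, 3), a = c(NA, NA, 4), b = c(NA, 4, NA), c = c(NA, 4, NA))

# Columns to inspect: everything except "t"
cols <- setdiff(names(df), "t")

# Hypothetical threshold: minimum number of non-NA values a row must have
min_non_na <- 1

# Number of non-NA values per row across the inspected columns
n_non_na <- rowSums(!is.na(df[, ..cols]))

# Keep only the rows that meet the threshold
res <- df[n_non_na >= min_non_na]
```

With `min_non_na <- 1` this keeps the rows where `t` is 2 and 3, matching the expected result. Raising it to 2 would keep only the row where `t` is 2, since the third row has a single non-NA value.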

Answer 2

We can use complete.cases with Reduce: apply complete.cases to each column separately, then combine the resulting logical vectors with an element-wise OR.

library(data.table)
# For columns a to c, flag the non-NA entries per column, OR the flags together, and subset df
df[df[, Reduce(`|`, lapply(.SD, complete.cases)), .SDcols = a:c]]
#   t  a  b  c
#1: 2 NA  4  4
#2: 3  4 NA NA
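To make the one-liner easier to follow, the steps can be written out separately. This is only an illustrative sketch on the question's sample data; the intermediate names `per_col` and `keep` are assumptions introduced here, not part of the original answer.

```r
library(data.table)

# Sample data from the question
df <- data.table(t = c(1, 2, 3), a = c(NA, NA, 4), b = c(NA, 4, NA), c = c(NA, 4, NA))

# One logical vector per column: TRUE where the value is not NA
per_col <- lapply(df[, .(a, b, c)], complete.cases)

# Element-wise OR across the columns: TRUE if any column holds a value in that row
keep <- Reduce(`|`, per_col)

# Subset the table with the combined logical vector
res <- df[keep]
```

Swapping the OR for an AND in `Reduce` would instead keep only rows that are complete in all three columns, which is the behaviour of calling complete.cases on the columns together.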

Solution 1
1 2020-05-15 10:01:41

Solution 2
1 Accepted 2020-05-15 21:17:08
