Problem: I want to check whether a row consists solely of NAs in a data.table
object. Currently, I have an implementation that relies on apply.
Is there a more efficient yet still readable solution?
Any improvements and ideas are welcome! Thanks
library(data.table)
dt <- data.table(
x = c("A", "B", "C", "D"),
y = c("true", NA, NA, "true"),
z = c(NA, NA, "true", "true"),
a = c(NA, NA, NA, "ha")
)
# Current Code:
idx <- apply(dt[, c(2:ncol(dt)), with = FALSE], 1, function(x) all(is.na(x)))
dt <- dt[!idx]
# Code Attempt 1 (not so nice due to the temporary na_count column)
rel_cols <- names(dt)[!names(dt) %in% c("x")]
# Keep rows where fewer than all of the relevant columns are NA
dt[, na_count := rowSums(is.na(.SD)), .SDcols = rel_cols][na_count < length(rel_cols)]
You can use rowSums like this:
library(data.table)
dt[rowSums(!is.na(dt[, ..rel_cols])) > 0]
# x y z a
#1: A true <NA> <NA>
#2: C <NA> true <NA>
#3: D true true ha
Or using .SDcols:
dt[dt[, rowSums(!is.na(.SD)) > 0, .SDcols = rel_cols]]
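As a further variant (a sketch, not part of the answer above), the same row mask can be built column by column with Reduce instead of rowSums. This avoids the intermediate logical matrix that is.na(.SD) creates, which may help on wide tables, though that is worth benchmarking on your own data. The name all_na is introduced here purely for illustration.

```r
library(data.table)

# Sample data from the question
dt <- data.table(
  x = c("A", "B", "C", "D"),
  y = c("true", NA, NA, "true"),
  z = c(NA, NA, "true", "true"),
  a = c(NA, NA, NA, "ha")
)
rel_cols <- setdiff(names(dt), "x")

# TRUE for rows where every column in rel_cols is NA:
# is.na() is applied per column and the results are combined with `&`
all_na <- dt[, Reduce(`&`, lapply(.SD, is.na)), .SDcols = rel_cols]

# Keep the rows that have at least one non-NA value in rel_cols
res <- dt[!all_na]
```

No temporary column is added to dt, so the original table is left unchanged.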
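Here is one option using base R's rowSums, which selects the rows that are entirely NA (negate the condition to drop them instead):
library(data.table)
dt[rowSums(is.na(dt)) == ncol(dt)]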
x y z a
1: <NA> <NA> <NA> <NA>
Data:
dt <- data.table(
x = c("A", NA, "C", "D"),
y = c("true", NA, NA, "true"),
z = c(NA, NA, "true", "true"),
a = c(NA, NA, NA, "ha")
)
Note: I intentionally altered your sample data slightly so that the second row of the data table consists entirely of NA values, to demonstrate that the answer works.
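Since the question is about removing such rows, here is a minimal sketch of the same test used both ways on the altered data. The names all_na, only_na and without_na are hypothetical and chosen only for this example.

```r
library(data.table)

# Altered sample data: the second row is entirely NA
dt <- data.table(
  x = c("A", NA, "C", "D"),
  y = c("true", NA, NA, "true"),
  z = c(NA, NA, "true", "true"),
  a = c(NA, NA, NA, "ha")
)

# TRUE for rows in which every column is NA
all_na <- rowSums(is.na(dt)) == ncol(dt)

only_na    <- dt[all_na]   # rows that are entirely NA
without_na <- dt[!all_na]  # rows with at least one non-NA value
```

Storing the mask once keeps the filter readable and lets the same vector be reused for both selections.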