
What is the most elegant way to check for patterns of missing data in R?

I have a set of numeric vectors in R, each of length 16. I would like to select those vectors that have all values present in at least one of four blocks of positions: 1:4, 5:8, 9:12, 13:16.

E.g. the vector c(NA, 1, NA, 1, 1, 1, 1, 1, NA, NA, 1, NA, NA, 1, NA, 1, NA) would pass the test, since positions 5:8 are all non-NA.

What is the most elegant way to test this (i.e. using minimal, easy-to-read code)?

With a list of indices, you can iterate over those ranges and look for the ones without any NA:

vec <- c(NA, 1, NA, 1, 1, 1, 1, 1, NA, NA, 1, NA, NA, 1, NA, 1, NA)
sapply(list(1:4, 5:8, 9:12, 13:16),
       function(ind) !anyNA(vec[ind]))
# [1] FALSE  TRUE FALSE FALSE

If you want to return the values within those indices:

inds <- list(1:4, 5:8, 9:12, 13:16)
good <- sapply(inds, function(ind) !anyNA(vec[ind]))
# should check that `any(good)` is true
inds[[ which(good)[1] ]]
# [1] 5 6 7 8
vec[ inds[[ which(good)[1] ]] ]
# [1] 1 1 1 1
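
Since the question is about selecting from a set of vectors, the check above can be wrapped in a small helper and applied with Filter. This is only a sketch: the name has_complete_block, its blocks argument and the example list vecs are made up for illustration.

```r
# Hypothetical helper: TRUE if at least one index block of x contains no NA
has_complete_block <- function(x, blocks = list(1:4, 5:8, 9:12, 13:16)) {
  any(vapply(blocks, function(ind) !anyNA(x[ind]), logical(1)))
}

# Made-up example data: a named list of length-16 vectors
vecs <- list(
  a = c(NA, 1, NA, 1, 1, 1, 1, 1, NA, NA, 1, NA, NA, 1, NA, 1),  # block 5:8 is complete
  b = rep(c(1, NA), 8),                                          # every block has an NA
  c = c(rep(NA, 12), 4, 3, 2, 1)                                 # block 13:16 is complete
)

# Keep only the vectors that pass the test
selected <- Filter(has_complete_block, vecs)
names(selected)
# [1] "a" "c"
```

vapply is used instead of sapply here so that the result is guaranteed to be a logical vector even if blocks is empty.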

Here is an option with rleid from data.table: take the run-length-encoding id of is.na(v1), use it as a grouping variable, and check whether any run consists entirely of non-NA elements and is at least 4 long.

library(data.table)
# flag every element that belongs to a run of >= 4 consecutive non-NA values
any(ave(!is.na(v1), rleid(is.na(v1)),
         FUN = function(x) all(x) && length(x) >= 4))
#[1] TRUE

Or, with base R's rle:

any(with(rle(!is.na(v1)), lengths[values] >= 4))
#[1] TRUE
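
The rle check looks for four consecutive non-NA values anywhere in the vector, which is slightly broader than the question's requirement of a complete block at 1:4, 5:8, 9:12 or 13:16. The following sketch illustrates the difference with a made-up vector v2. The matrix-based block check is not taken from the answers here; it is just one assumed way to align the test to the blocks, and it presumes the length is exactly 16.

```r
# Made-up vector: positions 3:6 are non-NA, so there is a run of four,
# but none of the blocks 1:4, 5:8, 9:12, 13:16 is complete
v2 <- c(NA, NA, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)

# Run-based check: any run of >= 4 consecutive non-NA values
run_check <- any(with(rle(!is.na(v2)), lengths[values] >= 4))

# Block-aligned check: lay the 16 values out as a 4 x 4 matrix
# (filled column-wise, so each column is one block) and count NAs per column
block_check <- any(colSums(is.na(matrix(v2, nrow = 4))) == 0)

run_check
# [1] TRUE
block_check
# [1] FALSE
```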

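Or another option is table, counting the non-NA elements in each block of four:

# subset the block ids by !is.na(v1) instead of multiplying by v1,
# so the counts do not depend on the actual values in v1
4 %in% table(((seq_along(v1) - 1) %/% 4)[!is.na(v1)])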
#[1] TRUE

data

v1 <- c(NA, 1, NA, 1, 1, 1, 1, 1, NA, NA, 1, NA, NA, 1, NA, 1, NA)

The following code will return a single value (TRUE or FALSE). It returns TRUE if the vector passes the test.

vec <- c(NA, 1, NA, 1, 1, 1, 1, 1, NA, NA, 1, NA, NA, 1, NA, 1, NA)

!all(tapply(vec, rep(seq_along(vec), each = 4, length.out = length(vec)), anyNA))
# [1] TRUE
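
To see what the one-liner does, it can be split into steps. This is a sketch only: the names grp and has_na are made up, and the block ids are computed with integer division, which gives the same grouping as the rep call. Note that the example vector actually has 17 elements, so a fifth group holding just the last value appears; it does not change the result.

```r
vec <- c(NA, 1, NA, 1, 1, 1, 1, 1, NA, NA, 1, NA, NA, 1, NA, 1, NA)

# Block id for each position: 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5
grp <- (seq_along(vec) - 1) %/% 4 + 1

# For each block: TRUE if it contains at least one NA.
# Here blocks 1, 3, 4 and the leftover 17th element contain NA; block 2 does not.
has_na <- tapply(vec, grp, anyNA)

# The vector passes if at least one block is free of NA
!all(has_na)
# [1] TRUE
```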
