
Identify NA's in sequence row-wise

I want to replace NA values in a row-wise sequence of observations, depending on where they occur in the sequence. Please see the example below.

ID | Observation 1 | Observation 2 | Observation 3 | Observation 4 | Observation 5
 A         NA              0               1             NA             NA

The condition is:

  • all NA values that come before the non-NA values in the sequence should be left as NA;
  • all NA values that come after the non-NA values in the sequence should be tagged ("Remove").

In the example above, the NA value in Observation 1 should remain NA, whereas the NA values in Observations 4 and 5 should be changed to "Remove".

You can define the function:

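replace.na <- function(r,val) {
  i <- is.na(r)       # TRUE where the row is NA
  j <- which(i)       # positions of the NA values
  k <- which(!i)      # positions of the non-NA values
  # replace only the NAs located after the last non-NA value
  r[j[j > k[length(k)]]] <- val
  r
}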
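As a quick check, the function can be called on a single vector before applying it to a whole data frame. This is a minimal sketch: the vector x is just row A from the question, and out1 and out2 are names made up for the illustration.

```r
# Re-declared here so the snippet runs on its own (same body as above).
replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[length(k)]]] <- val
  r
}

x <- c(NA, 0, 1, NA, NA)        # row A from the question
out1 <- replace.na(x, 999)
out1
# [1]  NA   0   1 999 999

# A row with no observed values comes back with every entry still NA:
# k is empty, so the comparison selects nothing to replace.
out2 <- replace.na(c(NA, NA, NA), 999)
```

The leading NA is kept because its position is smaller than the position of the last non-NA value.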

Then, assuming that you have a data.frame like so:

r <- data.frame(ID=c('A','B'),obs1=c(NA,1),obs2=c(0,NA),obs3=c(1,2),obs4=c(NA,3),obs5=c(NA,NA))
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1   NA   NA
##2  B    1   NA    2    3   NA

We can apply the function over the rows of all numeric columns of r:

r[,-1] <- t(apply(r[,-1],1,replace.na,999))    
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1  999  999
##2  B    1   NA    2    3  999

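apply first coerces r[,-1] to a matrix and then calls replace.na on each row. Because every call returns a vector of the same length, apply collects the results as the columns of a new matrix, so row i of the input becomes column i of the output. Therefore, we have to transpose the resulting matrix with t before assigning it back into the columns of r.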
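The orientation of the apply output can be seen on a small matrix. This is only an illustrative sketch; m and out are hypothetical names and the multiplication stands in for replace.na.

```r
m <- matrix(1:6, nrow = 2)                    # 2 rows, 3 columns
out <- apply(m, 1, function(row) row * 10)    # one result vector per row

dim(out)      # 3 2: each row's result became a column
dim(t(out))   # 2 3: transposing restores the original orientation
```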

Another way to call replace.na is:

r[,-1] <- do.call(rbind,lapply(data.frame(t(r[,-1])),replace.na,999))

Here, we first transpose the numeric columns of r and convert the result to a data.frame, so that each row of r becomes one column of the new data frame. Because a data frame is a list of columns, lapply calls replace.na on each of these columns, and do.call(rbind, ...) stacks the results back into rows.
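To see the intermediate object that lapply iterates over, the transposed data frame can be inspected on its own. This is a small sketch that rebuilds the example data; cols is a name introduced only for the illustration.

```r
r <- data.frame(ID = c('A', 'B'), obs1 = c(NA, 1), obs2 = c(0, NA),
                obs3 = c(1, 2), obs4 = c(NA, 3), obs5 = c(NA, NA))

cols <- data.frame(t(r[, -1]))   # one column per original row
names(cols)                      # "X1" "X2"
dim(cols)                        # 5 2: five observations, two original rows
cols$X1                          # the observations of row A as a plain vector
```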


If you want to flag all NAs after the first non-NA value instead, then the function replace.na should be:

replace.na <- function(r,val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  # k[1] is the first non-NA position, so every later NA is replaced
  r[j[j > k[1]]] <- val
  r
}

Applying it to the data:

r[,-1] <- do.call(rbind,lapply(data.frame(t(r[,-1])),replace.na,999))
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1  999  999
##2  B    1  999    2    3  999
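The question asked for the tag "Remove" rather than a numeric code. A possible sketch, assuming it is acceptable for the observation columns to become character (a numeric column cannot hold a string), passes the string as val. It uses the first version of replace.na (NAs after the last non-NA value).

```r
# First version of replace.na, re-declared so the snippet runs on its own.
replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[length(k)]]] <- val
  r
}

r <- data.frame(ID = c('A', 'B'), obs1 = c(NA, 1), obs2 = c(0, NA),
                obs3 = c(1, 2), obs4 = c(NA, 3), obs5 = c(NA, NA))

# Assigning a string coerces each row, and hence each column, to character.
r[, -1] <- t(apply(r[, -1], 1, replace.na, "Remove"))
r
```

After this call the remaining numbers are stored as strings, so they have to be converted back with as.numeric before any arithmetic.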
