Apply min or max function to arrays when NAs exist in R

I have a question that looks simple but is really driving me crazy. I really need your help.

First, let's generate some data:

a<-c(rep(1:2,2),NA,NA)
b<-c(rep(NA,3),3,4,NA)
df<-cbind(a,b)

This gives the following table:

      a  b
[1,]  1 NA
[2,]  2 NA
[3,]  1 NA
[4,]  2  3
[5,] NA  4
[6,] NA NA

Now I need a third column, defined as follows:

  1. When both a and b are not NA, return the larger of the two values.

  2. When only one of them is not NA, return the non-NA value.

  3. When both of them are NA, return NA.

To sum up, I am looking for a result like this:

      a  b  c
[1,]  1 NA  1
[2,]  2 NA  2
[3,]  1 NA  1
[4,]  2  3  3
[5,] NA  4  4
[6,] NA NA NA

I tried df$c<-max(df$a,df$b), which obviously doesn't work and gives me:

Error in df$a : $ operator is invalid for atomic vectors

Could someone help me please? Thank you very much!!

You could use pmax. Note that 'df' is a 'matrix', not a 'data.frame', so convert it with as.data.frame first:

cbind(df, c=do.call(`pmax`, c(as.data.frame(df), list(na.rm=TRUE))))
#      a  b  c
#[1,]  1 NA  1
#[2,]  2 NA  2
#[3,]  1 NA  1
#[4,]  2  3  3
#[5,] NA  4  4
#[6,] NA NA NA

If you need the "min" value for each row, replace pmax with pmin. To create a 'data.frame' instead of a 'matrix', you could use

df <- data.frame(a, b)

cbind returns its output as a 'matrix'. The $ operator does not work on a 'matrix', so use [ to select columns instead.
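To make the difference concrete, here is a minimal sketch of both routes using the data from the question. The names dat, c_min, m and c_from_matrix are hypothetical and only used for illustration; they do not appear in the original post.

```r
# Same data as in the question
a <- c(rep(1:2, 2), NA, NA)
b <- c(rep(NA, 3), 3, 4, NA)

# 'dat' is a hypothetical name, chosen to avoid overwriting the matrix 'df'
dat <- data.frame(a, b)

# With a data.frame, `$` works, and pmax()/pmin() compare element-wise
dat$c     <- pmax(dat$a, dat$b, na.rm = TRUE)  # row-wise maximum
dat$c_min <- pmin(dat$a, dat$b, na.rm = TRUE)  # row-wise minimum

# With the matrix returned by cbind(), select columns with `[` instead of `$`
m <- cbind(a, b)
c_from_matrix <- pmax(m[, "a"], m[, "b"], na.rm = TRUE)
```

With na.rm = TRUE, pmax and pmin return NA only where every input is NA, which matches the three rules in the question without any extra handling.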

You can also use the "regular" max function, applied row by row, with a guard for rows that are entirely NA:

df <- cbind(df, c = apply(df, 1, function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE)))

df
#      a  b  c
#[1,]  1 NA  1
#[2,]  2 NA  2
#[3,]  1 NA  1
#[4,]  2  3  3
#[5,] NA  4  4
#[6,] NA NA NA
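The check for all-NA rows matters because max with na.rm=TRUE has nothing left to compare once every value is dropped. A small sketch of the difference, where row_max is a hypothetical helper name that is not part of the original answer:

```r
all_na <- c(NA, NA)

# Without the guard: max() over zero remaining values is -Inf,
# and R emits a warning (suppressed here to keep the output clean)
unguarded <- suppressWarnings(max(all_na, na.rm = TRUE))

# With the guard: a row that is entirely NA stays NA
row_max <- function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE)
guarded <- row_max(all_na)

# Rows with at least one value behave as expected
row_max(c(2, 3))
row_max(c(NA, 4))
```

This is why the apply version needs the extra condition, while pmax with na.rm=TRUE handles the all-NA case by itself.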
