Using apply function on a matrix with NA entries

I read data from a CSV file. When I print it in R, I get:

  V1 V2  V3 V4  V5 V6 V7
1 14 25  83 64 987 45 78
2 15 65 789 32  14 NA NA
3 14 67  89 14  NA NA NA

If I want the maximum value in each column, I call apply() over the columns with max, and this is the result:

 V1  V2  V3  V4  V5  V6  V7 
 15  67 789  64  NA  NA  NA 

But it only works on the columns that have no NA. How can I change my code to handle the columns with NA too?

You just need to add na.rm=TRUE to your apply call.
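
As a minimal sketch, assuming the data from the question is held in a data frame called df (the name is an assumption, since the question does not give one), the extra argument is simply passed through apply() to max:

```r
# Rebuild the question's data; the name `df` is assumed for illustration.
df <- data.frame(
  V1 = c(14, 15, 14),
  V2 = c(25, 65, 67),
  V3 = c(83, 789, 89),
  V4 = c(64, 32, 14),
  V5 = c(987, 14, NA),
  V6 = c(45, NA, NA),
  V7 = c(78, NA, NA)
)

# Arguments given after FUN are forwarded to max, so NAs are dropped per column.
col_max <- apply(df, 2, max, na.rm = TRUE)
print(col_max)
```

Each column's maximum is then taken over its non-missing values only.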


Note: This assumes every column has at least one non-missing value. If a column has none, max returns -Inf with a warning (whereas sum would return 0).
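
If a column might contain nothing but NA, one possible guard is a small wrapper that returns NA for such columns instead of relying on what max() does with empty input. This is only a sketch: safe_max and the matrix m are hypothetical names made up for the illustration.

```r
# Hypothetical helper: column maximum that yields NA for all-NA columns.
safe_max <- function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
}

# Made-up matrix: column b has no non-missing values at all.
m <- cbind(a = c(1, 5, NA), b = c(NA, NA, NA))

res <- apply(m, 2, safe_max)
print(res)
```

Whether NA is the right placeholder depends on what the result is used for afterwards.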


fft does not have an na.rm argument. Therefore, you will need to write your own function.


For example:

df <- data.frame(matrix(5,5,5))
df[,3] <- NA

> df
  X1 X2 X3 X4 X5
1  5  5 NA  5  5
2  5  5 NA  5  5
3  5  5 NA  5  5
4  5  5 NA  5  5
5  5  5 NA  5  5

> apply(df,2,function(x){fft(x[!is.na(x)])})
$X1
[1] 2.500000e+01+0i 1.776357e-15+0i 1.776357e-15+0i 1.776357e-15+0i
[5] 1.776357e-15+0i

$X2
[1] 2.500000e+01+0i 1.776357e-15+0i 1.776357e-15+0i 1.776357e-15+0i
[5] 1.776357e-15+0i

$X3
complex(0)

$X4
[1] 2.500000e+01+0i 1.776357e-15+0i 1.776357e-15+0i 1.776357e-15+0i
[5] 1.776357e-15+0i

$X5
[1] 2.500000e+01+0i 1.776357e-15+0i 1.776357e-15+0i 1.776357e-15+0i
[5] 1.776357e-15+0i

Another option:

sapply(apply(df,2,na.exclude), fft)

EDIT: the code above may fail if apply() returns a matrix instead of a list, which happens when, for instance, there are no NAs (every column then has the same length). The code below fixes that:

sapply(tapply(m, col(m), na.exclude), max)

Interestingly, there is no need to set simplify=FALSE, because tapply() only simplifies its result when na.exclude() returns a single value for every column; and in that case sapply() works the same way.
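
A small sketch of the difference, using made-up matrices (m and full are hypothetical names chosen for this example):

```r
# Hypothetical 3 x 3 matrix, filled column by column:
# column 2 has one NA, column 3 has two.
m <- matrix(c(1, 2, 3,
              4, NA, 6,
              NA, NA, 9), nrow = 3)

# Split the values by column index, drop the NAs in each group, take the max.
by_col <- tapply(m, col(m), na.exclude)
res <- sapply(by_col, max)
print(res)

# With no NAs at all, apply() simplifies to a matrix, so a following
# sapply() would visit single cells rather than whole columns.
full <- matrix(1:6, nrow = 3)
is_mat <- is.matrix(apply(full, 2, na.exclude))
res_full <- sapply(tapply(full, col(full), na.exclude), max)
print(is_mat)
print(res_full)
```

The tapply() version groups by column in both cases, so the per-column result does not depend on whether any NAs happen to be present.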

Another option, which returns -Inf if all elements of a column are NA:

df<-structure(list(x = c(10, 12, 13), y = c(12, 13, NA), z = c(NA_real_, 
NA_real_, NA_real_)), .Names = c("x", "y", "z"), row.names = c(NA, 
-3L), class = "data.frame")

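# take the max of the non-missing values in each column, keeping the column names
kk <- Map(function(x) max(na.omit(df[, x])), names(df))
# assumed step: bind the per-column results into a one-column matrix named ll
ll <- do.call(rbind, kk)

> ll
  [,1]
x   13
y   13
z -Inf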

This may depend on having a later version of R, but you can actually do:

apply(df,2,function(x) max(x,na.rm=TRUE))

which returns a vector, or equivalently:

lapply(df,function(x) max(x,na.rm=TRUE))

which returns a list. Note that if any column in df is character, apply() converts the whole data frame to a character matrix, so the values are compared as strings rather than as numbers and the results are not numeric maxima. In that case, select the columns you are interested in first.
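
One way to do that selection, sketched with a made-up data frame (the label column is hypothetical and only there to show the filtering):

```r
# Hypothetical data frame mixing numeric and character columns.
df <- data.frame(x = c(10, 12, 13),
                 y = c(12, 13, NA),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Keep only the numeric columns before taking the column maxima.
num_cols <- sapply(df, is.numeric)
res <- sapply(df[, num_cols, drop = FALSE], max, na.rm = TRUE)
print(res)
```

Using drop = FALSE keeps the result a data frame even when only one numeric column remains.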

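Another option is to use the following:

apply(na.omit(df), 2, max)

na.omit(df) removes every row of your data frame df that contains an NA (the incomplete cases), and the apply() function then yields the max value of each column over the remaining rows. Note that this also discards the non-missing values in the dropped rows, so the result can differ from what na.rm=TRUE gives.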
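
Because na.omit() on a data frame drops whole rows rather than individual cells, it is worth comparing it with na.rm = TRUE on a small case. The data frame below is made up for the comparison:

```r
# Made-up data: the third row is incomplete because y is NA there.
df <- data.frame(x = c(10, 12, 13), y = c(12, 13, NA))

# na.omit() drops the third row entirely, so x = 13 is lost as well.
complete_rows <- apply(na.omit(df), 2, max)

# na.rm = TRUE drops NAs column by column instead.
per_column <- apply(df, 2, max, na.rm = TRUE)

print(complete_rows)
print(per_column)
```

The two results agree only when the maximum of every column happens to sit in a complete row.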

