Select columns that don't contain any NA value in R

Question

How can I select the columns that don't contain any NA values in R? If a column contains at least one NA, I want to exclude it. What's the best way to do this? I have been trying to use sum(is.na(x)), but without success.

A second, related question: is there a command that excludes columns in which every value is the same? For example,

     column1  column2
row1   a        b
row2   a        c
row3   a        c

My purpose is to exclude column1 from my matrix so the final result is:

     column2
row1   b
row2   c
row3   c
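
The desired result for this second part can be sketched as follows. This is only an illustration under assumptions: the matrix `m` is rebuilt from the example above, and `varies` is a hypothetical helper name. A column is kept when it holds more than one distinct value.

```r
# Example matrix from the question (values typed in by hand)
m <- matrix(c("a", "a", "a",
              "b", "c", "c"),
            nrow = 3,
            dimnames = list(c("row1", "row2", "row3"),
                            c("column1", "column2")))

# TRUE for columns that contain more than one distinct value
varies <- apply(m, 2, function(col) length(unique(col)) > 1)

# drop = FALSE keeps the result a matrix even if one column remains
m_varied <- m[, varies, drop = FALSE]
print(m_varied)
```

Note that unique() treats NA as a value of its own, so a column such as c("a", NA, "a") would count as varying in this sketch.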

Answer 1

The related question "Remove columns from dataframe where ALL values are NA" deals with the case where ALL values in a column are NA.

For a matrix, you can use colSums(is.na(x)) to find out which columns contain NA values.

Given a matrix x,

x[, !colSums(is.na(x)), drop = FALSE]

will subset appropriately.
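
As a small worked example of the matrix case, here is a sketch with made-up data (the matrix contents and the names `na_counts` and `x_clean` are assumptions for illustration only):

```r
# Hypothetical 3 x 3 matrix, filled column by column; column "b" has one NA
x <- matrix(c(1, 2, 3,
              4, NA, 6,
              7, 8, 9),
            nrow = 3,
            dimnames = list(NULL, c("a", "b", "c")))

# is.na(x) is a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(x))

# !na_counts is TRUE only where the count is 0 (0 is coerced to FALSE)
x_clean <- x[, !na_counts, drop = FALSE]
print(x_clean)
```

Here drop = FALSE matters: without it, a result with a single remaining column would be simplified to a plain vector.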

For a data.frame, it will be more efficient to use lapply or sapply with the function anyNA:

xdf[, sapply(xdf, Negate(anyNA)), drop = FALSE]
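
A minimal sketch of the data.frame case with made-up data (the column names and the `keep` vector are hypothetical). It uses vapply instead of sapply, which is a swapped-in variant that guarantees a logical vector even for a data frame with zero columns. The likely reason this is cheaper is that anyNA checks each column directly, whereas is.na() on a data frame first builds a full logical matrix.

```r
# Hypothetical data frame; only "score" contains an NA
xdf <- data.frame(id = 1:3,
                  score = c(0.5, NA, 0.7),
                  label = c("x", "y", "z"))

# One TRUE/FALSE per column: TRUE when the column has no NA
keep <- !vapply(xdf, anyNA, logical(1))

# Keep only the complete columns; drop = FALSE preserves the data.frame class
xdf_clean <- xdf[, keep, drop = FALSE]
print(names(xdf_clean))
```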

Answer 2

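Alternatively, if 'mat1' is the matrix:

# column indices that contain at least one NA
indx <- unique(which(is.na(mat1), arr.ind=TRUE)[,2])
# keep the other columns; setdiff() also handles an empty indx,
# where subset(mat1, select=-indx) would drop every column
mat1[, setdiff(seq_len(ncol(mat1)), indx), drop=FALSE]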
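
To make the intermediate step concrete, the following sketch (with a made-up `mat1`; the names `na_pos` and `mat1_clean` are assumptions) shows what which(..., arr.ind = TRUE) returns: one row per NA cell, with columns named "row" and "col".

```r
# Hypothetical matrix, filled column by column; columns 1 and 3 contain NA
mat1 <- matrix(c(1, NA, 3,
                 4, 5, 6,
                 NA, 8, NA),
               nrow = 3)

# Row/column positions of every NA cell
na_pos <- which(is.na(mat1), arr.ind = TRUE)
print(na_pos)

# Distinct column indices that hold at least one NA
indx <- unique(na_pos[, "col"])

# Select the complement of indx; this also works when indx is empty
mat1_clean <- mat1[, setdiff(seq_len(ncol(mat1)), indx), drop = FALSE]
print(mat1_clean)
```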

Answer 3

Alternatively, you could do

new.df <- df[, colSums(is.na(df)) == 0, drop = FALSE]

This approach also lets you subset on the number of NA values in each column, for example by changing the == 0 condition.
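
Subsetting on the NA count could look like the sketch below. The data, the `max_na` threshold, and the 50% cutoff are all hypothetical values chosen for illustration.

```r
# Hypothetical data frame with 0, 1 and 3 NA values per column
df <- data.frame(a = c(1, 2, 3, 4),
                 b = c(NA, 2, 3, 4),
                 c = c(NA, NA, NA, 4))

# Number of NA values in each column
na_counts <- colSums(is.na(df))

# Keep columns with at most max_na missing values (assumed threshold)
max_na <- 1
df_some <- df[, na_counts <= max_na, drop = FALSE]

# Same idea with a proportion: keep columns that are less than 50% NA
df_prop <- df[, colMeans(is.na(df)) < 0.5, drop = FALSE]

print(names(df_some))
print(names(df_prop))
```

Setting max_na to 0 reproduces the strict "no NA at all" selection.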

solution1
3 ACCEPTED 2014-06-13 04:50:50

solution2
0 2014-06-13 09:37:54

solution3
0 2015-06-15 15:53:23
