In R, how do I filter a data frame to only include rows with >=2 non-NA values?

Question

Suppose I have a data frame:

      Grp1 Grp2 Grp3
Trt1    NA    1   NA
Trt2     2    3   NA
Trt3     4   NA    5

I'd like to filter this down to only include rows where the number of non-NA values is at least some threshold (in this case 2). So for this example I would like the result:

      Grp1 Grp2 Grp3
Trt2     2    3   NA
Trt3     4   NA    5

Answer 1

You could use rowSums() and is.na() to filter the data frame. is.na() returns a logical matrix with the same dimensions as the data frame, so this allocates a temporary matrix (which may be a concern for very large data frames), but it should do the trick.

df1[rowSums(!is.na(df1)) >= 2, ]
     Grp1 Grp2 Grp3
Trt2    2    3   NA
Trt3    4   NA    5

Data:

df1 <- read.table(header = TRUE, text = "      Grp1 Grp2 Grp3
Trt1    NA    1   NA
Trt2     2    3   NA
Trt3     4   NA    5")
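
The rowSums() approach can also be wrapped in a small function so that the threshold becomes a parameter. This is only a minimal sketch: the name keep_rows_min_non_na and its min_non_na argument are made up for illustration and are not part of base R or of the answer above.

```r
# Hypothetical helper (name and argument are illustrative only):
# keep the rows of `df` that have at least `min_non_na` non-NA values.
keep_rows_min_non_na <- function(df, min_non_na = 2) {
  # !is.na(df) is a logical matrix; rowSums() counts the TRUEs in each row
  non_na_count <- rowSums(!is.na(df))
  # drop = FALSE keeps the result a data frame even with a single column
  df[non_na_count >= min_non_na, , drop = FALSE]
}

df1 <- read.table(header = TRUE, text = "      Grp1 Grp2 Grp3
Trt1    NA    1   NA
Trt2     2    3   NA
Trt3     4   NA    5")

# With the default threshold of 2 this keeps rows Trt2 and Trt3
result <- keep_rows_min_non_na(df1, min_non_na = 2)
print(result)
```

Raising min_non_na to 3 would return a data frame with zero rows for this data, since no row is fully populated.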

Answer 2

You can do it this way:

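# Count the NA values in each row
count_na <- apply(data, 1, function(x) sum(is.na(x)))
# Keep rows with at least 2 non-NA values, i.e. at most ncol(data) - 2 NAs
data[count_na <= ncol(data) - 2, ]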

Sample data:

  col1 col2 col3
1    1    1   NA
2   NA   NA    2
3   NA    3    3

Output:

  col1 col2 col3
1    1    1   NA
3   NA    3    3
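
To try this on the sample data above, the table can be rebuilt with data.frame(). The sketch below also checks that counting NAs and counting non-NA values describe the same rows: with three columns, fewer than 2 NAs is the same condition as at least 2 non-NA values. The variable names filtered and count_non_na are only illustrative.

```r
# Rebuild the sample data shown above
data <- data.frame(col1 = c(1, NA, NA),
                   col2 = c(1, NA, 3),
                   col3 = c(NA, 2, 3))

# Number of NA values in each row
count_na <- apply(data, 1, function(x) sum(is.na(x)))

# Number of non-NA values in each row, derived from the NA count
count_non_na <- ncol(data) - count_na

# With 3 columns, "fewer than 2 NAs" equals "at least 2 non-NA values"
filtered <- data[count_na < 2, ]
print(filtered)
```

For a data frame with a different number of columns the two conditions diverge, so filtering on count_non_na >= 2 states the intent more directly.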

Answer 3

Another option:

data[apply(data, 1, function(x) sum(!is.na(x)) >= 2), ]
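
One thing worth checking is how this behaves when the columns have different types, because apply() converts the data frame to a matrix before calling the function on each row. A small sketch, using a made-up data frame named mixed, suggests the NA positions survive that conversion and that the apply() and rowSums() forms select the same rows:

```r
# Hypothetical example data with a character column and two numeric columns
mixed <- data.frame(id = c("a", NA, "c"),
                    x  = c(1, NA, NA),
                    y  = c(NA, 2, 3),
                    stringsAsFactors = FALSE)

# Row-wise test via apply(): TRUE where a row has at least 2 non-NA values
via_apply <- apply(mixed, 1, function(x) sum(!is.na(x)) >= 2)

# The same test via rowSums() on the logical matrix from is.na()
via_rowsums <- rowSums(!is.na(mixed)) >= 2

# Both logical vectors can be used as a row index
kept <- mixed[via_apply, ]
print(kept)
```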

3 answers

solution1
6 ACCEPTED 2019-11-26 17:05:17

solution2
2 2019-11-26 17:06:11

solution3
1 2019-11-26 17:09:11
