R: Select rows in which at least half of the columns satisfy a certain condition

Question

I have a problem I am simply not able to figure out. I imported a table into R which has 20 columns and around 20,000 rows. The columns hold numeric values. How can I select only the rows in which at least half of the columns have a value greater than 20? In other words, I only want the rows in which at least 10 columns have a value greater than 20.

I know how to select rows in which any one column has to have a value greater than 20. For that, I used this code:

    y=Table[apply(Table[, -1], MARGIN = 1, function(x) any(x > 20)), ]

Is there any way to do the same such that at least half the columns have a value greater than 20?

Thanks!

Answer 1

Using your approach, we could just replace any() with a function that checks whether at least half of the values in the row exceed 20:

    y <- Table[apply(Table[, -1], MARGIN = 1, function(x) sum(x > 20) * 2 >= length(x)), ]

============== EDIT ================

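A simpler and faster option is to avoid apply() and use the vectorized rowSums(). Note that this version compares every column of Table and hard-codes 10 as half of the 20 columns; if the first column should be skipped, as in your code, use Table[, -1] inside rowSums().

    Y <- Table[rowSums(Table > 20) >= 10, ]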
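
As a minimal sketch of both approaches on toy data: the data frame below, its column names (id, v1 to v4) and the use of na.rm = TRUE are assumptions made for illustration, not part of the original table. The threshold is computed from ncol() so it does not depend on there being exactly 20 columns.

```r
# Hypothetical toy data: an ID column followed by four numeric columns.
Table <- data.frame(
  id = c("a", "b", "c", "d"),
  v1 = c(25, 5, 30, NA),
  v2 = c(30, 10, 5, 40),
  v3 = c(5, 50, 5, 50),
  v4 = c(1, 2, 3, 4)
)

# Drop the first (non-numeric) column, as Table[, -1] does in the question.
vals <- Table[, -1]

# Per row, count how many values exceed 20; NA comparisons are skipped.
n_over <- rowSums(vals > 20, na.rm = TRUE)

# Keep rows where at least half of the columns pass, for any column count.
keep <- n_over * 2 >= ncol(vals)
y_fast <- Table[keep, ]

# The apply() version should select the same rows.
y_apply <- Table[apply(vals, 1, function(x) sum(x > 20, na.rm = TRUE) * 2 >= length(x)), ]

print(y_fast)  # rows with id "a" and "d" are kept
```

Whether a missing value should count as failing the condition (as here) or should drop the row entirely depends on the data; without na.rm = TRUE, a row containing NA gets an NA count and shows up as an all-NA row in the result.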
