
R: Select rows in which at least half of the columns satisfy a certain condition

I have a problem I can't figure out. I imported a table into R that has 20 columns and around 20,000 rows. The columns hold numeric values. How can I select only the rows in which at least half of the columns have a value greater than 20? In other words, I want to keep only the rows in which at least 10 columns have a value greater than 20.

I know how to select the rows in which at least one column has a value greater than 20. For that, I used this code:

    y=Table[apply(Table[, -1], MARGIN = 1, function(x) any(x > 20)), ]  

Is there any way to do the same such that at least half the columns have a value greater than 20?

Thanks!

Using your approach, we can simply replace any() with a function that counts the values above 20 and compares that count with half the number of columns:

    y=Table[apply(Table[, -1], MARGIN = 1, function(x) {sum(x>20)*2 >= length(x)}), ]
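To see what that predicate does, here is a minimal sketch on made-up data. The data frame `Table` below, with an `id` column followed by four numeric columns, is purely hypothetical and only stands in for the real table.

```r
# Hypothetical toy data: the first column is an identifier, the rest are numeric.
Table <- data.frame(
  id = c("a", "b", "c", "d"),
  v1 = c(25, 10, 30, 5),
  v2 = c(30, 15, 10, 5),
  v3 = c(5, 40, 10, 5),
  v4 = c(1, 2, 50, 60)
)

# For each row (MARGIN = 1), count the values above 20 and keep the row
# if that count is at least half the number of columns checked.
keep <- apply(Table[, -1], MARGIN = 1, function(x) sum(x > 20) * 2 >= length(x))

y <- Table[keep, ]
print(y$id)
```

Rows "a" and "c" each have two of their four values above 20, so they are kept; rows "b" and "d" have only one and are dropped.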

============== EDIT ================

A simpler and faster option is to avoid apply() and count with rowSums() instead, dropping the first column as in your code:

    Y <- Table[rowSums(Table[, -1] > 20) >= 10, ]
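If the number of columns may change, or the data may contain missing values, the same idea can be written a little more generally. This is only a sketch: the toy `Table`, the names `vals`, `threshold`, and `n_above`, and the choice to treat `NA` as "not greater than 20" are all assumptions made for illustration, not something stated in the question.

```r
# Hypothetical toy data with one missing value.
Table <- data.frame(
  id = c("a", "b", "c"),
  v1 = c(25, 10, NA),
  v2 = c(30, 15, 30),
  v3 = c(5, 40, 45),
  v4 = c(1, 2, 3)
)

vals <- Table[, -1]           # drop the non-numeric first column
threshold <- ncol(vals) / 2   # "at least half" of the columns checked

# vals > 20 is a logical matrix; rowSums() counts the TRUEs in each row.
# na.rm = TRUE makes a missing value count as "not greater than 20".
n_above <- rowSums(vals > 20, na.rm = TRUE)

Y <- Table[n_above >= threshold, ]
print(Y$id)
```

Deriving the threshold from `ncol(vals)` avoids hard-coding 10. When there are no missing values, `rowMeans(vals > 20) >= 0.5` selects the same rows.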
