I have a problem I simply cannot figure out. I imported a table into R that has 20 columns and around 20,000 rows. The columns hold numeric values. How can I select only the rows in which at least half of the columns have a value greater than 20? In other words, I only want the rows in which at least 10 columns have a value greater than 20.
I know how to select the rows in which at least one column has a value greater than 20. For that, I used this code:
y=Table[apply(Table[, -1], MARGIN = 1, function(x) any(x > 20)), ]
Is there any way to do the same such that at least half the columns have a value greater than 20?
Thanks!
Using your approach, we could simply replace any() with a function that checks whether at least half of the values in a row are greater than 20:
y=Table[apply(Table[, -1], MARGIN = 1, function(x) {sum(x>20)*2 >= length(x)}), ]
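To see what this predicate does, here is a small sketch on made-up data. The data frame `toy` and its values are assumptions for illustration only, with the first column treated as an identifier, as the `Table[, -1]` in the question suggests. Because `mean()` of a logical vector is the proportion of `TRUE` values, `mean(x > 20) >= 0.5` should be an equivalent way to write the same test.

```r
# Hypothetical toy data: an ID column followed by four numeric columns
toy <- data.frame(
  id = c("a", "b", "c"),
  v1 = c(25, 5, 30),
  v2 = c(30, 50, 30),
  v3 = c(10, 5, 30),
  v4 = c(5, 5, 10)
)

# Row-wise test: are at least half of the numeric values greater than 20?
keep_sum  <- apply(toy[, -1], MARGIN = 1, function(x) sum(x > 20) * 2 >= length(x))
keep_mean <- apply(toy[, -1], MARGIN = 1, function(x) mean(x > 20) >= 0.5)

# Check that both formulations agree on this data
same_result <- identical(keep_sum, keep_mean)

# Keep only the rows that pass the test
toy_kept <- toy[keep_sum, ]
```

Multiplying the count by 2 and comparing it with `length(x)` avoids hard-coding the number of columns, so the same line keeps working if columns are added or removed.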
============== EDIT ================
It would be simpler and faster to avoid apply() and use rowSums() instead:
Y <- Table[rowSums(Table > 20) >= 10,]
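A minimal sketch of the `rowSums()` approach on made-up data follows. The names `toy`, `cutoff`, `num_cols`, `min_hits`, and `hits` are hypothetical. It assumes, as the `Table[, -1]` in the question suggests, that the first column is an identifier and should be left out of the comparison; if every column is numeric, the `[, -1]` can be dropped. The `na.rm = TRUE` argument is an optional addition so that missing values simply do not count toward the threshold.

```r
# Hypothetical toy data: an ID column followed by four numeric columns
toy <- data.frame(
  id = c("a", "b", "c"),
  v1 = c(25, 5, NA),
  v2 = c(30, 50, 30),
  v3 = c(10, 5, 30),
  v4 = c(5, 5, 10)
)

cutoff   <- 20
num_cols <- toy[, -1]                    # leave out the ID column
min_hits <- ceiling(ncol(num_cols) / 2)  # "at least half" of the numeric columns

# num_cols > cutoff yields a logical matrix; rowSums() counts the TRUEs per row
hits <- rowSums(num_cols > cutoff, na.rm = TRUE)

# Keep the rows that reach the required number of hits
toy_kept <- toy[hits >= min_hits, ]
```

Computing the threshold from `ncol()` rather than writing `10` directly keeps the filter correct if the number of columns changes.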