Subsetting large data frames with a condition

I have got the following dataset:

ID  s1  s2  s3
A   0.6 1   0.3
B   3   0.4 0.4
C   3   2   1
D   0   0.3 0.2
E   3   2   0.1

I would like to retain the rows that have a value >= 0.5 in at least two of the three samples.

So, the new data frame would be:

ID  s1  s2  s3
A   0.6 1   0.3
C   3   2   1
E   3   2   0.1

Thanks in advance

You can do

df[rowSums(df[-1] >= 0.5) >= 2, ]
#  ID  s1 s2  s3
#1  A 0.6  1 0.3
#3  C 3.0  2 1.0
#5  E 3.0  2 0.1

We create a logical matrix with df[-1] >= 0.5 (df[-1] drops the ID column), count the TRUE values in each row with rowSums(), and keep the rows where that count is at least 2.
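To make the one-liner easier to follow, here is the same idea split into steps. This is only a sketch on the example data from the question; the intermediate names hits, n_pass, keep and result are made up for illustration.

```r
# Rebuild the example data so this block runs on its own.
df <- read.table(text = "ID  s1  s2  s3
A   0.6 1   0.3
B   3   0.4 0.4
C   3   2   1
D   0   0.3 0.2
E   3   2   0.1", header = TRUE, stringsAsFactors = FALSE)

# Step 1: compare every sample column (all columns except ID) with the cutoff.
hits <- df[-1] >= 0.5          # logical matrix, one row per ID

# Step 2: TRUE counts as 1, so rowSums() gives the number of samples that pass.
n_pass <- rowSums(hits)

# Step 3: keep only the rows where at least two samples pass.
keep <- n_pass >= 2
result <- df[keep, ]
```

Because the comparison and rowSums() are both vectorised, no explicit loop over rows is needed, which is what makes this approach suitable for large data frames.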

data

df <- read.table(text="ID  s1  s2  s3
A   0.6 1   0.3
B   3   0.4 0.4
C   3   2   1
D   0   0.3 0.2
E   3   2   0.1", header = TRUE, stringsAsFactors = FALSE)
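If the cutoff or the required number of samples may change, the same filter can be wrapped in a small function. The sketch below is an assumption-laden illustration, not part of the original answer: keep_rows_passing, threshold and min_count are hypothetical names, numeric columns are detected automatically instead of dropping the first column, and NA values are simply treated as not passing.

```r
# Example data, repeated so this block runs on its own.
df <- read.table(text = "ID  s1  s2  s3
A   0.6 1   0.3
B   3   0.4 0.4
C   3   2   1
D   0   0.3 0.2
E   3   2   0.1", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: keep rows where at least `min_count` numeric columns
# have a value >= `threshold`.
keep_rows_passing <- function(data, threshold = 0.5, min_count = 2) {
  # Pick the numeric columns instead of hard-coding the position of ID.
  numeric_cols <- vapply(data, is.numeric, logical(1))
  # na.rm = TRUE means a missing value never counts as passing.
  n_pass <- rowSums(data[numeric_cols] >= threshold, na.rm = TRUE)
  data[n_pass >= min_count, , drop = FALSE]
}

default_result <- keep_rows_passing(df)
strict_result  <- keep_rows_passing(df, threshold = 1, min_count = 2)
```

With the defaults this reproduces the filter from the question; the second call shows a stricter cutoff of 1.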
