
Subset rows where all other columns meet a condition in R

Hello, I've been trying for two days to solve this issue without success, so I would really appreciate some help. I have the following data frame:

[image: enter image description here]

I have 48 columns: one is called Orthogroup and the other 47 are organism names. The rows of the Orthogroup column hold the orthogroup names, while the organism columns hold numbers giving the number of copies of each orthogroup in that organism.

I've been trying to make a subset that keeps only the orthogroup rows where every value in the organism columns is either 0 or 1, for example OG00001 = 1, 0, 1, 0, 1, etc. I tried using this command:

newdf <- subset(Orthogroups.GeneCount, Orthogroups.GeneCount[1:48,] == 1)

Alternatively, if no orthogroup meets the condition in every column, I would like to get the rows where the condition is met in at least some number of columns, for example in at least 32 of the 48 columns, and show only those orthogroups. I used to have a command for this but I lost it. Thanks a lot for the help. I also tried dplyr's filter, but %>% does not work.

Something like this should work (not tested):

Orthogroups.GeneCount[
  rowSums(Orthogroups.GeneCount[, -1] == 0 |
            Orthogroups.GeneCount[, -1] == 1) == ncol(Orthogroups.GeneCount) - 1, ]
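
To make the idea concrete, and to cover the "at least 32 columns" variant from the question, here is a minimal base R sketch on a made-up table. The data frame `counts`, its organism columns `sp_A` to `sp_D`, and the threshold `min_cols` are hypothetical stand-ins for `Orthogroups.GeneCount`; the sketch assumes the first column holds the orthogroup names and that there are no missing values.

```r
# Toy stand-in for Orthogroups.GeneCount (hypothetical values,
# 4 organism columns instead of 47)
counts <- data.frame(
  Orthogroup = c("OG0000001", "OG0000002", "OG0000003", "OG0000004"),
  sp_A = c(1, 2, 0, 1),
  sp_B = c(0, 1, 1, 1),
  sp_C = c(1, 5, 0, 3),
  sp_D = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a count is 0 or 1
# (the first column is dropped because it holds the names)
is_low <- counts[, -1] == 0 | counts[, -1] == 1

# Per row, the number of organism columns that satisfy the condition
n_ok <- rowSums(is_low)

# Rows where every organism column is 0 or 1
all_low <- counts[n_ok == ncol(counts) - 1, ]

# Rows where at least min_cols organism columns are 0 or 1
min_cols <- 3  # assumed threshold for the toy data; 32 in the question
mostly_low <- counts[n_ok >= min_cols, ]
```

Because `rowSums()` counts the TRUE values per row, the same count serves both the strict filter and the threshold filter. If the real table contains `NA`, the row counts become `NA` as well, so `rowSums(is_low, na.rm = TRUE)` may be needed. As for the pipe, `%>%` is only available after a package that provides it has been attached, for example with `library(dplyr)`.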

