Subset rows where all other columns meet a condition in R

Hello, I've been trying for two days to solve this issue without success, and I would really appreciate some help. I have the following data frame:

[Image: screenshot of the data frame]

I have 48 columns: one of them is the Orthogroup column and the other 47 are named after organisms. The rows of the Orthogroup column hold the orthogroup names, while the organism columns hold numbers giving the number of copies of each orthogroup in that organism.

I've been trying to make a subset containing only the orthogroup rows where all values in the other columns are either 0 or 1, for example OG00001 = 1, 0, 1, 0, 1, etc. I tried using this command:

newdf <- subset(Orthogroups.GeneCount, Orthogroups.GeneCount[1:48,] == 1)

Or, if no orthogroup meets the condition in every column, I would like the rows where the condition is met in at least some number x of columns, for example in at least 32 of the 48 columns, showing only the orthogroups that meet it. I used to have a command for this but I lost it. Thanks a lot for the help. I tried dplyr's filter, but %>% does not work.

Something like this should work (not tested):

Orthogroups.GeneCount[
  rowSums(Orthogroups.GeneCount[, -1] == 0 |
            Orthogroups.GeneCount[, -1] == 1) == ncol(Orthogroups.GeneCount) - 1, ]
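
To make the idea concrete, here is a minimal sketch on a toy data frame. The values and the organism column names (sp_A to sp_D) are made up for illustration, and it assumes the first column is the orthogroup name and that there are no NA counts. It also covers the relaxed case from the question, where the condition only has to hold in at least a given number of columns (min_cols is a hypothetical name; with the real data it could be set to 32).

```r
# Toy stand-in for Orthogroups.GeneCount (hypothetical values, 4 organisms instead of 47)
Orthogroups.GeneCount <- data.frame(
  Orthogroup = c("OG00001", "OG00002", "OG00003", "OG00004"),
  sp_A = c(1, 2, 0, 1),
  sp_B = c(0, 1, 1, 1),
  sp_C = c(1, 1, 0, 3),
  sp_D = c(0, 5, 1, 1)
)

# Drop the name column so only the copy counts remain
counts <- Orthogroups.GeneCount[, -1]

# For each row, count how many organism columns hold 0 or 1
n_ok <- rowSums(counts == 0 | counts == 1)

# Strict version: every organism column is 0 or 1
all_01 <- Orthogroups.GeneCount[n_ok == ncol(counts), ]

# Relaxed version: at least min_cols organism columns are 0 or 1
min_cols <- 3
mostly_01 <- Orthogroups.GeneCount[n_ok >= min_cols, ]

print(all_01$Orthogroup)
print(mostly_01$Orthogroup)
```

On this toy data the strict version keeps OG00001 and OG00003, and the relaxed version also keeps OG00004. As for %>%: that operator only becomes available after loading a package that provides it, such as dplyr or magrittr, so a missing library(dplyr) call is one possible reason it did not work.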
