
Removing rows from a table based on values in a list, conditional on a value in a separate column, using base R

I have a table:

ID Phenotype 
AA 1
AB 1
AC 0
AD 1
AE 0
AF 1
AG 0

I have a list of IDs of those with the "1" phenotype that I want to keep, separating them from the other "1" phenotypes, which should be removed. I want to keep all the "0" phenotypes.

Say the list reads: AB, AD

The desired outcome would be:

        ID Phenotype 
        AA 1
        AB 1
        AC 0
        AD 1
        AE 0
        AG 0

I.e. AF would be removed because it has phenotype "1" but is not on the list, and all the phenotype "0" rows remain untouched.

In reality the table and the list are thousands of entries long. All the IDs are unique.

I work on an HPC which is airlocked from outside tools, so base R solutions are preferred. I can subset the table into phenotypes 1 and 0, remove those I do not want, and then rejoin the table, but I was wondering if there is a one-liner way of doing this?
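For reference, the multi-step approach described above could be sketched roughly as follows. This is only an illustration: the names `keep_ids`, `ones`, `zeros` and `out` are made up here, and the data frame simply mirrors the small example table.

```r
# Example table from the question, rebuilt as a data frame
df <- data.frame(
  ID = c("AA", "AB", "AC", "AD", "AE", "AF", "AG"),
  Phenotype = c(1, 1, 0, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)
keep_ids <- c("AB", "AD")  # hypothetical name for the list of IDs to keep

# Step 1: split the table by phenotype
ones  <- df[df$Phenotype == 1, ]
zeros <- df[df$Phenotype == 0, ]

# Step 2: among the "1" rows, keep only the listed IDs
ones <- ones[ones$ID %in% keep_ids, ]

# Step 3: rejoin, then restore the original row order
out <- rbind(ones, zeros)
out <- out[order(match(out$ID, df$ID)), ]
```

It works, but it needs several intermediate objects and an extra reordering step, which is what a single logical filter would avoid.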

Many thanks

A base R one-liner would be:

Code:

df[df[, 1] %in% v | df[, 2] == 0, ]

# checks which data.frame entries are matched in the supplied vector

df[, 1] %in% v 

# checks which second column entries equal 0

df[, 2] == 0

# then we just utilize | to tell R to accept entries that satisfy either of our 
# conditions 
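As a possible variant (not part of the original answer), the same filter can address the columns by name and use `%in%` for the phenotype test as well. This may matter if the phenotype column can contain `NA`: `==` returns `NA` for a missing value, and an `NA` in a logical row index produces a row of `NA`s, whereas `%in%` returns `FALSE`. The data frame `df_na` and its extra row "AH" below are invented purely to illustrate that difference.

```r
# Example data from the question plus one made-up row ("AH") with a missing phenotype
df_na <- data.frame(
  ID = c("AA", "AB", "AC", "AD", "AE", "AF", "AG", "AH"),
  Phenotype = c(1, 1, 0, 1, 0, 1, 0, NA),
  stringsAsFactors = FALSE
)
v <- c("AB", "AD")

# Name-based filter; %in% treats NA as "no match", so AH is simply dropped
by_name <- df_na[df_na$ID %in% v | df_na$Phenotype %in% 0, ]

# Same filter with ==; the NA phenotype yields an NA index and an all-NA row
with_eq <- df_na[df_na$ID %in% v | df_na$Phenotype == 0, ]

nrow(by_name)  # 5 rows: AB, AC, AD, AE, AG
nrow(with_eq)  # 6 rows: the same five plus one all-NA row
```

If the phenotype column is known to have no missing values, both forms select the same rows.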

Data:

df <- read.table(text = "ID Phenotype 
AA 1
AB 1
AC 0
AD 1
AE 0
AF 1
AG 0", header = TRUE)

v <- c("AB", "AD")
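Putting the data and the one-liner together gives a quick end-to-end check. The name `res` is introduced here only for illustration.

```r
# Rebuild the example data
df <- read.table(text = "ID Phenotype
AA 1
AB 1
AC 0
AD 1
AE 0
AF 1
AG 0", header = TRUE, stringsAsFactors = FALSE)

v <- c("AB", "AD")

# Keep rows whose ID is in v, or whose phenotype is 0
res <- df[df[, 1] %in% v | df[, 2] == 0, ]
print(res)

# Optionally renumber the rows after filtering
rownames(res) <- NULL
```

With this data the filter keeps AB, AC, AD, AE and AG. Note that AA is dropped along with AF, since it has phenotype 1 and is not in `v`. The desired outcome shown in the question keeps AA, so if that is intended, AA would need to be included in `v`.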
