
Filter table to exclude rows if they contain a value in a column of a second table

I have one main table with over 608,000 rows (the first 7 rows are depicted below). These correspond to locations in the genome, along with one or two identifiers given to them by Affymetrix (and dbSNP).

Affy SNP ID      dbSNP RS ID   Chromosome   Chromosome Start
Affx-26018273    rs10056215    5            163542505

I then have another table with only 46 rows. I need to remove a row from the main table if both its Chromosome and Chromosome Start values are found together in one of the 46 rows of the second table. Here is the second table; it does not have the Affymetrix/dbSNP identifiers.

1   5641055

How can I filter out these records?

Using R, you can remove every row of Tab1 whose last column holds a number that appears in the second column of the 46-row table Tab2. Note that this compares the start position only, not the chromosome:

 Tab1 <- Tab1[!(Tab1[, ncol(Tab1)] %in% Tab2[, 2]), ]
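Since the question asks for rows where both Chromosome and Chromosome Start match, a possible base R variant is to compare a combined key instead of the position alone. This is only a sketch on toy data: the column names AffyID, Chromosome and Start, and every row other than the two values quoted in the question, are made up for illustration.

```r
# Toy stand-ins for the two tables; names and most values are assumptions.
Tab1 <- data.frame(
  AffyID     = c("Affx-1", "Affx-2", "Affx-3", "Affx-4"),
  Chromosome = c("1", "5", "1", "2"),
  Start      = c(5641055L, 163542505L, 999L, 5641055L),
  stringsAsFactors = FALSE
)
Tab2 <- data.frame(
  Chromosome = c("1", "5"),
  Start      = c(5641055L, 42L),
  stringsAsFactors = FALSE
)

# Build one key per row so that chromosome and position must match together.
# Positions are integers in both tables, so their text forms agree.
key1 <- paste(Tab1$Chromosome, Tab1$Start, sep = ":")
key2 <- paste(Tab2$Chromosome, Tab2$Start, sep = ":")

# Logical negation also behaves correctly when no row matches.
filtered <- Tab1[!(key1 %in% key2), ]
print(filtered)
```

Here Affx-4 is kept even though its position equals one in Tab2, because its chromosome differs; matching on the position column alone would have dropped it.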

Hope this helps.

You could use the anti_join function from the dplyr package, or that package's filter function.

Say your data.frame is the built-in mtcars and you want to filter out the cars whose cylinder count appears in the following data.frame, i.e., those with 4 or 6 cylinders:

dontuse <- data.frame(cyl = c(4,6), blah = c(1,2))

You could run:

library(dplyr)
anti_join(mtcars, dontuse, by = "cyl")

or

mtcars %>%
  filter(! cyl %in% dontuse$cyl)

Both of these return only the rows where cyl is neither 4 nor 6.

    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
2  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
3  16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
4  17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
5  15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
6  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
7  10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
8  14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
9  15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
10 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
11 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
12 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
13 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
14 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
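Applied to the original question, where a row should be dropped only if both columns match, anti_join can be given both column names in by. The following is a rough sketch on toy data; the names main_tab and exclude_tab, the column names, and all values other than those quoted in the question are hypothetical.

```r
library(dplyr)

# Toy stand-in for the 608,000-row main table (values are illustrative).
main_tab <- data.frame(
  AffyID     = c("Affx-1", "Affx-2", "Affx-3", "Affx-4"),
  Chromosome = c("1", "5", "1", "2"),
  Start      = c(5641055L, 163542505L, 999L, 5641055L),
  stringsAsFactors = FALSE
)

# Toy stand-in for the 46-row exclusion table, which has no identifiers.
exclude_tab <- data.frame(
  Chromosome = c("1", "5"),
  Start      = c(5641055L, 42L),
  stringsAsFactors = FALSE
)

# Keep the rows of main_tab that have no partner in exclude_tab
# on the combination of Chromosome and Start.
kept <- anti_join(main_tab, exclude_tab, by = c("Chromosome", "Start"))
print(kept)
```

The join columns need compatible types in both tables (here character and integer), so it may be worth checking how each file was read in before joining.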
