
Do these number ranges overlap each other?

I've been using two different target prediction programs to predict binding sites on genes, and I'm using R to process the results.

The problem is that the programs report a different number of targets per gene, and the locations are slightly different. What I was trying to do was see whether these sites are the same or, at least, given the start and stop position of each site, whether the ranges from the two programs overlap.

Say I have two programs, X and Y.

X predicts two sites: x1 holds the start positions of both sites and x2 holds the stop positions. The same goes for Y, which predicts four sites.

x1<-c(1521,1259)
x2<-c(1544,1282)

y1<-c(1825,1522,1259,362)
y2<-c(1848,1543,1282,384)

So both of the X sites overlap sites in Y. I would like to output those positions in a table:

|   x1     |   x2     |   y1     |   y2     |

|   1521   |   1544   |   1522   |   1543   |
|   1259   |   1282   |   1259   |   1282   |

What I was originally thinking was that, if I only had one site for each program, the following would tell me whether they overlap (the start position of x must not be past the stop position of y, and the start position of y must not be past the stop position of x):

x1 <= y2 && y1 <= x2

I'm not sure how to do the same for my actual problem, at least not without writing a lot of loops and ifs.

The IRanges package (and GenomicRanges for genomic data, when chromosome and possibly strand are important) allows you to define ranges:

library(IRanges)
x <- IRanges(x1, x2)
y <- IRanges(y1, y2)

and ask questions about them:

y %over% x     # any type of overlap: one TRUE/FALSE per range in y
y %within% x   # range in y lies entirely within a range in x
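To get the table asked for in the question, rather than a logical vector, one option is `findOverlaps()`, which returns the pairs of overlapping ranges. The following is a minimal sketch using the example data from the question; the names `hits` and `result` are only illustrative.

```r
library(IRanges)

# Example data from the question
x1 <- c(1521, 1259)
x2 <- c(1544, 1282)
y1 <- c(1825, 1522, 1259, 362)
y2 <- c(1848, 1543, 1282, 384)

x <- IRanges(x1, x2)
y <- IRanges(y1, y2)

# Each hit pairs the index of a range in x (query) with the index
# of a range in y (subject) that it overlaps
hits <- findOverlaps(x, y)

# Look up the coordinates of every overlapping pair
result <- data.frame(
  x1 = start(x)[queryHits(hits)],
  x2 = end(x)[queryHits(hits)],
  y1 = start(y)[subjectHits(hits)],
  y2 = end(y)[subjectHits(hits)]
)
result
```

Each row of `result` is one overlapping pair, so an X site that overlaps several Y sites would appear on several rows.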

See ?findOverlaps for more detail, as well as the package vignettes (from the landing pages, above), these publications a, b for a general introduction, and the Bioconductor support site if the ranges infrastructure seems useful.
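If installing a Bioconductor package is not an option, the single-pair test from the question (`x1 <= y2 && y1 <= x2`) can probably be applied to all pairs at once with `outer()` in base R. This is only a sketch under that assumption, not part of IRanges; `overlap`, `idx` and `result` are illustrative names. It builds a full length(x) by length(y) matrix, so it suits small inputs like these rather than genome-scale data.

```r
# Example data from the question
x1 <- c(1521, 1259)
x2 <- c(1544, 1282)
y1 <- c(1825, 1522, 1259, 362)
y2 <- c(1848, 1543, 1282, 384)

# overlap[i, j] is TRUE when X site i and Y site j overlap:
# x1[i] <= y2[j] and y1[j] <= x2[i]
overlap <- outer(x1, y2, "<=") & outer(x2, y1, ">=")

# Row/column indices of the TRUE cells, one row per overlapping pair
idx <- which(overlap, arr.ind = TRUE)

result <- data.frame(
  x1 = x1[idx[, "row"]],
  x2 = x2[idx[, "row"]],
  y1 = y1[idx[, "col"]],
  y2 = y2[idx[, "col"]]
)
result
```

Both ends are treated as inclusive here, matching the `<=` comparisons in the question.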


 