
Find indices of overlapping ranges in R

My data frame looks like this:

4 8
6 9
1 2
5 7
10 14
3 9

The first column is the start and the second column is the end of a measurement. I want to return the indices of the rows that overlap a specific row. For example, for row 1 the indices would be 2, 4, 6, as these rows overlap it. I need to make this comparison very frequently, so an efficient solution would be great.

Note that I am looking not only for partial overlap but also for complete overlap (e.g. row 6, 3 9, which fully contains row 1).
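To make the overlap condition concrete: two closed ranges [s1, e1] and [s2, e2] share at least one point exactly when s1 <= e2 and s2 <= e1. A minimal base R sketch of that test follows. The function name overlaps_with is hypothetical, the column names V1 and V2 are taken from the sample data at the end of this page, and the ranges are assumed to be inclusive at both ends.

```r
DF <- data.frame(V1 = c(4L, 6L, 1L, 5L, 10L, 3L),
                 V2 = c(8L, 9L, 2L, 7L, 14L, 9L))

# Hypothetical helper: indices of rows whose closed range [V1, V2]
# shares at least one point with row i (row i itself is excluded).
overlaps_with <- function(d, i) {
  hit <- d$V1 <= d$V2[i] & d$V2 >= d$V1[i]
  hit[i] <- FALSE
  which(hit)
}

overlaps_with(DF, 1)
## [1] 2 4 6
```

This is a single vectorized comparison per query row, so it is a reasonable baseline to time the package-based answers against.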

Here's a possible solution using the foverlaps() function from the data.table package.

Set column names and pick the row index:

library(data.table)
cols <- c("start", "end")
indx <- 1L

Convert your data to a data.table object and set the column names, then separate the specific row from the rest of the data and key it. Keying is an essential step; see ?foverlaps for more.

setnames(setDT(df), cols)
temp <- setkeyv(df[indx], cols)

Run the foverlaps function. The type parameter selects which kind of overlap to match.

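# xid is a position within df[-indx]; adding 1 maps it back to a row of df,
# which is only valid here because indx is 1.
foverlaps(df[-indx], temp, which=TRUE, 
          type="any", nomatch=0L)$xid + 1 
## [1] 2 4 6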
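The `+ 1` above relies on the chosen row being the first one. A sketch of how the same call could be wrapped for an arbitrary row index follows; the function name overlapping_rows and the table dt are hypothetical, and the column names start and end are the ones set earlier in this answer.

```r
library(data.table)

dt <- data.table(start = c(4L, 6L, 1L, 5L, 10L, 3L),
                 end   = c(8L, 9L, 2L, 7L, 14L, 9L))

# Hypothetical wrapper: rows of dt that overlap row indx.
overlapping_rows <- function(dt, indx) {
  target <- setkeyv(dt[indx], c("start", "end"))  # y must be keyed
  others <- seq_len(nrow(dt))[-indx]              # original row numbers of dt[-indx]
  hits <- foverlaps(dt[-indx], target, which = TRUE,
                    type = "any", nomatch = 0L)
  sort(others[hits$xid])                          # map positions back to rows of dt
}

overlapping_rows(dt, 1L)
overlapping_rows(dt, 4L)
```

Keeping the vector of original row numbers and indexing it with xid avoids any arithmetic that depends on where the removed row was.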

You could use the "IRanges" package:

library(IRanges)

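# Use the queryHits() accessor rather than reaching into the S4 slot directly.
# The result includes row 1 itself, since every range overlaps itself.
queryHits(findOverlaps(IRanges(DF$V1, DF$V2), IRanges(DF$V1[1], DF$V2[1])))
#[1] 1 2 4 6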

Or generate all overlaps at once and subset later:

# Newer IRanges releases call this argument drop.self; older ones used ignoreSelf.
overls = findOverlaps(IRanges(DF$V1, DF$V2), drop.self = TRUE)
split(subjectHits(overls), queryHits(overls))

subjectHits(overls)[queryHits(overls) == 1]
#[1] 2 4 6
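For a sense of what "all overlaps at once" amounts to, here is a base R sketch that builds the full pairwise overlap matrix and then subsets it. This is not how IRanges computes its result; it is an illustration that needs memory proportional to the square of the number of rows, so it only suits small inputs. The names ov and all_overlaps are hypothetical.

```r
DF <- data.frame(V1 = c(4L, 6L, 1L, 5L, 10L, 3L),
                 V2 = c(8L, 9L, 2L, 7L, 14L, 9L))

# ov[i, j] is TRUE when closed ranges i and j share at least one point.
ov <- outer(DF$V1, DF$V2, "<=") & outer(DF$V2, DF$V1, ">=")
diag(ov) <- FALSE  # a range is not reported as overlapping itself

# One integer vector of overlapping row indices per row.
all_overlaps <- lapply(seq_len(nrow(DF)), function(i) which(ov[i, ]))

all_overlaps[[1]]
## [1] 2 4 6
```

Rows with no overlaps (here rows 3 and 5) get an empty integer vector, whereas split() on the hits simply has no entry for them.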

The example data, "DF":

DF = structure(list(V1 = c(4L, 6L, 1L, 5L, 10L, 3L), V2 = c(8L, 9L, 
2L, 7L, 14L, 9L)), .Names = c("V1", "V2"), class = "data.frame", row.names = c(NA, 
-6L))
