Find overlapping ranges based on positions in R

I have two datasets:

 chr1 25 85
 chr1 2000 3000
 chr2 345 2300

and the second:

chr1 34 45 1.2
chr1 100 1000
chr2 456 1500 1.3

This is my desired output,

chr1 25 85 1.2
chr2 345 2300 1.3

Below is my code:

sb <- NULL
rangesC <- NULL
sb$bin <- NULL
for (i in levels(df1$V1)) {
  s <- subset(df1, df1$V1 == i)
  sb <- subset(df2, df2$V1 == i)
  for (j in 1:nrow(sb)) {
    sb$bin[j] <- s$V4[(s$V2 <= sb$V2[j] & s$V3 >= sb$V3[j])]
  }
  rangesC <- try(rbind(rangesC, sb), silent = TRUE)
}

The error I get is:

replacement has length zero

Alternatively, when I use as.character, rangesC ends up empty.

I would like to get the corresponding V4 value wherever the positions overlap. What is going wrong?

The foverlaps() function from the data.table package does an overlap join of two data.tables:

library(data.table)
setDT(df1, key = names(df1))
setDT(df2, key = key(df1))
foverlaps(df2, df1, nomatch = 0L)[, -c("i.V2", "i.V3")]
     V1  V2   V3  V4
1: chr1  25   85 1.2
2: chr2 345 2300 1.3
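
For readers who cannot use data.table, the same overlap condition can be sketched in base R. This is only an illustration, not the method used above. It assumes the two data frames hold the values shown in the question, with the missing fourth field of the second dataset entered as NA. The names rhs, cand and hits are made up for the example. Two closed ranges [a1, a2] and [b1, b2] overlap when a1 <= b2 and a2 >= b1.

```r
# Sample data from the question (the NA for the missing V4 is an assumption)
df1 <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(25, 2000, 345),
                  V3 = c(85, 3000, 2300),
                  stringsAsFactors = FALSE)
df2 <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(34, 100, 456),
                  V3 = c(45, 1000, 1500),
                  V4 = c(1.2, NA, 1.3),
                  stringsAsFactors = FALSE)

# Rename the range columns of df2 so they do not clash with those of df1
rhs <- setNames(df2, c("V1", "i.V2", "i.V3", "V4"))

# Pair every df1 range with every df2 range on the same chromosome
cand <- merge(df1, rhs, by = "V1")

# Keep the pairs whose closed ranges overlap, and only the df1 range plus V4
hits <- cand[cand$V2 <= cand$i.V3 & cand$V3 >= cand$i.V2,
             c("V1", "V2", "V3", "V4")]
rownames(hits) <- NULL
print(hits)
```

Because merge() builds every same-chromosome pair before filtering, this is probably only practical for small inputs. An overlap join such as foverlaps() is likely the better choice for large range sets.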

Data

library(data.table)
df1 <- fread(
  "chr1 25 85
 chr1 2000 3000
 chr2 345 2300", header = FALSE
)

df2 <- fread(
  "chr1 34 45 1.2
chr1 100 1000 
chr2 456 1500 1.3", header = FALSE
)
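
As for what goes wrong in the loop from the question, two causes seem likely, although the asker's exact column types are not shown. First, levels() returns NULL for a character column, so a loop over levels(df1$V1) never runs and rangesC stays empty. That would match the as.character symptom. Second, when the logical index selects nothing, the right-hand side has length zero, and assigning that to a single element of an existing vector raises "replacement has length zero". The sketch below reproduces both points and then shows one possible corrected loop. It uses base R data frames with the question's values (the NA is an assumption), and out and hit are illustrative names.

```r
df1 <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(25, 2000, 345),
                  V3 = c(85, 3000, 2300),
                  stringsAsFactors = FALSE)
df2 <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(34, 100, 456),
                  V3 = c(45, 1000, 1500),
                  V4 = c(1.2, NA, 1.3),   # NA for the missing value is an assumption
                  stringsAsFactors = FALSE)

# Cause 1: a character column has no levels, so for (i in levels(...)) does nothing
no_levels <- is.null(levels(df1$V1))

# Cause 2: a zero-length value cannot replace one element of an existing vector
x <- numeric(2)
err <- try(x[1] <- numeric(0), silent = TRUE)
failed <- inherits(err, "try-error")

# One possible fix: loop over the rows of df1 and guard against "no match"
out <- df1
out$V4 <- NA_real_
for (i in seq_len(nrow(out))) {
  hit <- df2$V1 == out$V1[i] &        # same chromosome
    df2$V2 <= out$V3[i] &             # df2 range starts before df1 range ends
    df2$V3 >= out$V2[i] &             # df2 range ends after df1 range starts
    !is.na(df2$V4)                    # ignore df2 rows without a value
  if (any(hit)) out$V4[i] <- df2$V4[hit][1]   # keep the first match only
}
rangesC <- out[!is.na(out$V4), ]
print(rangesC)
```

Taking only the first match is a simplification. If one range in df1 can overlap several ranges in df2, a join that returns one row per overlapping pair is more appropriate.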
