I need to keep only the rows of "map" whose position falls within one of the intervals in the "ref" table.
Here is an example of the "map" table:
map<-"chr start tag depth BCV State
chr1 1 chr1-1 1 2 1
chr1 2 chr1-2 1 3 2
chr1 3 chr1-3 1 2 3
chr1 4 chr1-4 2 2 4
chr2 5 chr2-5 2 2 5
chr2 1 chr2-1 2 2 6
chr2 2 chr2-2 3 2 4
chr2 3 chr2-3 3 2 3
chr2 4 chr2-4 3 2 2
chr2 5 chr2-5 3 2 1
chr2 6 chr2-6 3 2 7
chr2 7 chr2-7 3 2 9
chr2 8 chr2-8 2 2 2
chr2 9 chr2-9 2 2 1"
map<-read.table(text=map,header=T)
And I have a reference map like this example:
ref<-"chr start end
chr1 1 2
chr1 2 3
chr1 5 6
chr2 7 9"
ref<-read.table(text=ref,header=T)
And I need a final table like this:
final<-"chr start tag depth BCV State
chr1 1 chr1-1 1 2 1
chr1 2 chr1-2 1 3 2
chr1 3 chr1-3 1 2 3
chr2 7 chr2-7 3 2 9
chr2 8 chr2-8 2 2 2
chr2 9 chr2-9 2 2 1"
final<-read.table(text=final,header=T)
As this was tagged with data.table, here's a simple data.table::foverlaps solution:
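library(data.table)
setDT(map)[, end := start]   # turn each position into a closed interval [start, end]
setkey(setDT(ref))           # key on all columns: chr, start, end
indx <- unique(foverlaps(map, ref, which = TRUE, nomatch = 0L)$xid)
map[indx]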
# chr start tag depth BCV State end
# 1: chr1 1 chr1-1 1 2 1 1
# 2: chr1 2 chr1-2 1 3 2 2
# 3: chr1 3 chr1-3 1 2 3 3
# 4: chr2 7 chr2-7 3 2 9 7
# 5: chr2 8 chr2-8 2 2 2 8
# 6: chr2 9 chr2-9 2 2 1 9
This basically adds an end column to map in order to close the intervals, then keys the ref data set so that foverlaps knows which columns define the matching intervals (chr is part of the key, so matches are restricted to the same chromosome). foverlaps is then run with unmatched rows dropped (nomatch = 0L), and unique is applied to the matched row indices of map in case the intervals in ref overlap each other. Finally, map is subset by that index.
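The same filter can also be sketched as a data.table non-equi join, which avoids adding the end column to map. This is an alternative, not the method described above. It assumes a data.table version that supports non-equi conditions in on, and the names idx and res are made up for the illustration.

```r
library(data.table)

map <- read.table(text = "chr start tag depth BCV State
chr1 1 chr1-1 1 2 1
chr1 2 chr1-2 1 3 2
chr1 3 chr1-3 1 2 3
chr1 4 chr1-4 2 2 4
chr2 5 chr2-5 2 2 5
chr2 1 chr2-1 2 2 6
chr2 2 chr2-2 3 2 4
chr2 3 chr2-3 3 2 3
chr2 4 chr2-4 3 2 2
chr2 5 chr2-5 3 2 1
chr2 6 chr2-6 3 2 7
chr2 7 chr2-7 3 2 9
chr2 8 chr2-8 2 2 2
chr2 9 chr2-9 2 2 1", header = TRUE, stringsAsFactors = FALSE)

ref <- read.table(text = "chr start end
chr1 1 2
chr1 2 3
chr1 5 6
chr2 7 9", header = TRUE, stringsAsFactors = FALSE)

setDT(map)
setDT(ref)

# For every ref interval, find the rows of map on the same chr whose
# start lies in [ref start, ref end]. which = TRUE returns row numbers
# of map; nomatch = 0L drops intervals that match nothing.
idx <- map[ref, on = .(chr, start >= start, start <= end),
           which = TRUE, nomatch = 0L]

# Overlapping intervals in ref can hit the same map row more than once,
# so deduplicate and restore the original row order before subsetting.
res <- map[sort(unique(idx))]
res
```

As in the foverlaps version, the unique step matters only because intervals in ref may overlap.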
First, you need to expand the intervals:
L <- lapply(split(ref,ref$chr), function(d) unique(unlist(mapply(seq,d$start,d$end,SIMPLIFY = F))))
which will give you:
#$chr1
#[1] 1 2 3 5 6
#$chr2
#[1] 7 8 9
And then you can merge:
ref2 <- setNames(stack(L),c('start','chr'))
merge(map,ref2)
Final output:
# chr start tag depth BCV State
#1 chr1 1 chr1-1 1 2 1
#2 chr1 2 chr1-2 1 3 2
#3 chr1 3 chr1-3 1 2 3
#4 chr2 7 chr2-7 3 2 9
#5 chr2 8 chr2-8 2 2 2
#6 chr2 9 chr2-9 2 2 1
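Expanding every interval into individual positions can get large when the intervals are wide. A minimal base R sketch that tests each row of map against the intervals directly, without expanding them, could look like the following. The helper vector in_ref and the result name kept are made up for the illustration, and the row-by-row check is meant for small tables like the example rather than as a tuned solution.

```r
map <- read.table(text = "chr start tag depth BCV State
chr1 1 chr1-1 1 2 1
chr1 2 chr1-2 1 3 2
chr1 3 chr1-3 1 2 3
chr1 4 chr1-4 2 2 4
chr2 5 chr2-5 2 2 5
chr2 1 chr2-1 2 2 6
chr2 2 chr2-2 3 2 4
chr2 3 chr2-3 3 2 3
chr2 4 chr2-4 3 2 2
chr2 5 chr2-5 3 2 1
chr2 6 chr2-6 3 2 7
chr2 7 chr2-7 3 2 9
chr2 8 chr2-8 2 2 2
chr2 9 chr2-9 2 2 1", header = TRUE, stringsAsFactors = FALSE)

ref <- read.table(text = "chr start end
chr1 1 2
chr1 2 3
chr1 5 6
chr2 7 9", header = TRUE, stringsAsFactors = FALSE)

# TRUE for a map row if at least one ref interval on the same chr
# contains its start position (both interval ends inclusive).
in_ref <- mapply(
  function(ch, pos) any(ref$chr == ch & ref$start <= pos & pos <= ref$end),
  map$chr, map$start,
  USE.NAMES = FALSE
)

kept <- map[in_ref, ]
kept$tag
# "chr1-1" "chr1-2" "chr1-3" "chr2-7" "chr2-8" "chr2-9"
```

Unlike merge, this keeps the original row order and row names of map.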