Finding overlapping ranges and

I have two data frames, each containing two variables (mz and rt) that are reported as ranges: mzmin, mzmed, mzmax, rtmin, rtmed and rtmax. They look like this:

table1 <- read.csv("table1.csv")

name        mzmed       mzmin       mzmax       rtmed   rtmin   rtmax
M1          202.1110    202.110859  202.111285  50.35   49.62   51.13
M2          373.144219  373.143792  373.154876  50.38   49.62   51.86
M3          371.14497   371.144256  371.145224  80.34   79.62   81.41
M4          372.147279  372.146992  372.147583  100.35  99.62   101.41

table2 <- read.csv("table2.csv")

name        mzmed       mzmin       mzmax       rtmed   rtmin   rtmax
M1          558.109976  558.102886  558.111497  10.89   9.95    11.95
M2          371.144564  371.144000  371.144999  80.29   79.14   81.98
M3          498.091821  498.091632  498.092225  658.15  656.57  660.96
M4          284.098785  284.098429  284.099092  760.32  758.67  761.2

In this case, I want M3 of table1 and M2 of table2 to be written to a new table because their mz ranges overlap.

It would also be useful to write them to the new table only if the rt ranges of M2 and M3 are less than 100 apart. I assume IRanges would be the best tool for this, but I am not sure.

Any help or suggestion would be appreciated.

As Uwe Block commented, foverlaps() from the data.table package works.

library(data.table)

# Rebuild the question's two tables inline so the example is reproducible
table1 <- data.table(read.table(header = TRUE,
                     text = "name        mzmed       mzmin       mzmax       rtmed   rtmin   rtmax
M1          202.1110    202.110859  202.111285  50.35   49.62   51.13
M2          373.144219  373.143792  373.154876  50.38   49.62   51.86
M3          371.14497   371.144256  371.145224  80.34   79.62   81.41
M4          372.147279  372.146992  372.147583  100.35  99.62   101.41
"))

table2 <- data.table(read.table(header = TRUE,
                     text = "name        mzmed       mzmin       mzmax       rtmed   rtmin   rtmax
M1          558.109976  558.102886  558.111497  10.89   9.95    11.95
M2          371.144564  371.144000  371.144999  80.29   79.14   81.98
M3          498.091821  498.091632  498.092225  658.15  656.57  660.96
M4          284.098785  284.098429  284.099092  760.32  758.67  761.2
"))

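# foverlaps() needs the second table keyed on its interval start and end columns
setkey(table2, mzmin, mzmax)
# Columns that come from table1 carry an "i." prefix in the result
out <- foverlaps(table1, table2, type = "any", nomatch = 0L)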

> out
   name    mzmed   mzmin   mzmax rtmed rtmin rtmax i.name i.mzmed  i.mzmin  i.mzmax i.rtmed i.rtmin i.rtmax
1:   M2 371.1446 371.144 371.145 80.29 79.14 81.98     M3 371.145 371.1443 371.1452   80.34   79.62   81.41
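
To make the overlap rule itself explicit, here is a minimal base R sketch that reproduces the same match without data.table. It is not how foverlaps() is implemented. It pairs every row with every row and applies the usual closed-interval test. The names t1, t2, pairs and hits are made up for this illustration, and only the mz bounds from the question are copied in.

```r
# Illustrative copies of the two tables, reduced to name and mz bounds.
t1 <- data.frame(name  = c("M1", "M2", "M3", "M4"),
                 mzmin = c(202.110859, 373.143792, 371.144256, 372.146992),
                 mzmax = c(202.111285, 373.154876, 371.145224, 372.147583))
t2 <- data.frame(name  = c("M1", "M2", "M3", "M4"),
                 mzmin = c(558.102886, 371.144000, 498.091632, 284.098429),
                 mzmax = c(558.111497, 371.144999, 498.092225, 284.099092))

# by = NULL makes merge() return the Cartesian product (every t1 row
# paired with every t2 row); shared column names receive the suffixes.
pairs <- merge(t1, t2, by = NULL, suffixes = c(".t1", ".t2"))

# Two closed intervals overlap when each one starts before the other ends.
hits <- pairs[pairs$mzmin.t1 <= pairs$mzmax.t2 &
              pairs$mzmin.t2 <= pairs$mzmax.t1, ]

print(hits[, c("name.t1", "name.t2")])
```

The cross join grows with the product of the two row counts, so this is only reasonable for small tables. foverlaps() is the better choice for real data.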

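To also require that the rt ranges of the two matched rows are within 100 of each other, filter the result on the rt columns (the i. prefix marks the columns that came from table1):

out[abs(rtmin - i.rtmax) < 100 | abs(i.rtmin - rtmax) < 100]

For the sample data this keeps the single matched row, because the two rt ranges (79.14 to 81.98 and 79.62 to 81.41) overlap.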
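
The question's "rt ranges less than 100 apart" condition can also be expressed as an explicit gap between two intervals. This is one possible reading of the requirement. The helper name interval_gap is hypothetical, not part of data.table or base R. The sketch returns 0 when the intervals overlap and otherwise the distance between their nearest ends.

```r
# Hypothetical helper: distance between two closed intervals
# [a_min, a_max] and [b_min, b_max]; 0 if they overlap or touch.
# Vectorised, so it also works on whole columns.
interval_gap <- function(a_min, a_max, b_min, b_max) {
  pmax(0, pmax(a_min, b_min) - pmin(a_max, b_max))
}

# rt ranges of the matched pair from the question (table2 M2, table1 M3)
gap_match <- interval_gap(79.14, 81.98, 79.62, 81.41)

# rt ranges of table1 M3 and table2 M3, which lie far apart
gap_far <- interval_gap(79.62, 81.41, 656.57, 660.96)

print(gap_match < 100)  # TRUE: the ranges overlap, so the gap is 0
print(gap_far < 100)    # FALSE: the gap is well above 100
```

Applied to the foverlaps() result, the same idea would be a filter such as out[interval_gap(rtmin, rtmax, i.rtmin, i.rtmax) < 100], assuming the column names shown in the output above.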
