Removing intersecting regions from intervals in R

If I have a set of intervals, how can I find and remove their intersections in R? For example, if I have:

start <- c(5, 9, 1, 2, 14, 18)
end <- c(10, 12, 3, 4, 16, 20)
d <- cbind(start, end)
start end
 5  10
 9  12
 1   3
 2   4
14  16
18  20

I want the output to be

start end
 5   8
11  12
 1   1
 4   4
14  16
18  20

For example, the first interval intersects the second one. Once the intersection is removed, the first interval becomes (5,8) and the second becomes (11,12), because 9 and 10 were included in both intervals and so should be removed. In other words, I want to test the intervals for intersections, remove any intersection found, and return the intervals with their new start and end points. How can I code this in R?

This could be what you are looking for:

start <- c(5, 9, 1, 2, 14, 18)
end <- c(10, 12, 3, 4, 16, 20)
d <- cbind(start, end)

# work on a copy of the matrix so that d stays unchanged
temp <- d

# i loops over 1, 2 and 3, because d has 6 rows, i.e. 3 pairs of rows
for (i in seq_len(nrow(temp) %/% 2)) {
  # thisLine and nextLine together pick out one pair of rows in the matrix
  # thisLine takes the values 1, 3 and 5
  thisLine <- (2 * i) - 1
  # nextLine takes the values 2, 4 and 6
  nextLine <- 2 * i

  # there is an intersection if the start of nextLine is greater than the
  # start of thisLine AND not greater than the end of thisLine (end points
  # are inclusive, so intervals sharing only an end point also intersect)
  if (temp[nextLine, "start"] > temp[thisLine, "start"] &&
      temp[nextLine, "start"] <= temp[thisLine, "end"]) {
    # remember the initial end of thisLine
    initial_end_thisLine <- temp[thisLine, "end"]
    # set the new end of thisLine to the start of nextLine - 1
    temp[thisLine, "end"] <- temp[nextLine, "start"] - 1
    # set the new start of nextLine to the initial end of thisLine + 1
    temp[nextLine, "start"] <- initial_end_thisLine + 1
  }
}

# get the output
output <- temp
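
The same pairwise logic can also be sketched without an explicit loop, using vectorized matrix indexing. This is only an illustration under a few assumptions of mine: the number of rows is even, the rows are compared in consecutive pairs (1 with 2, 3 with 4, ...), and the end points are inclusive integers, which is why the comparison with the end uses <=. The names first, second, overlap and out are made up for this sketch.

```r
start <- c(5, 9, 1, 2, 14, 18)
end <- c(10, 12, 3, 4, 16, 20)
d <- cbind(start, end)

# row indices of the first and second interval in each consecutive pair
# (assumes nrow(d) is even)
first <- seq(1, nrow(d), by = 2)
second <- first + 1

# a pair overlaps when the second interval starts inside the first one
# (assumption: end points are inclusive, hence <=)
overlap <- d[second, "start"] > d[first, "start"] &
  d[second, "start"] <= d[first, "end"]

out <- d
# trim the first interval so it ends just before the second one starts
out[first[overlap], "end"] <- d[second[overlap], "start"] - 1
# move the start of the second interval just past the original end of the first
out[second[overlap], "start"] <- d[first[overlap], "end"] + 1

out
```

For the example data, out has the start column 5, 11, 1, 4, 14, 18 and the end column 8, 12, 1, 4, 16, 20, which matches the output requested in the question.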

Please notice:

1- A for loop is rarely the idiomatic choice in R. I used one here only to illustrate the solution; vectorized indexing or the apply family of functions is usually preferable.

2- I understood your question to mean that you ONLY compare each consecutive pair of lines (1 and 2, 3 and 4, and so on) and look for an intersection within each pair. If you want to compare every line with every other line, a different solution is needed.

3- The matrix d must have an even number of rows for this solution to work.
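
If every interval does have to be compared with every other one (point 2), one possible approach for integer intervals is to expand each interval into the integers it covers, keep only the integers that occur in exactly one interval, and collapse what is left back into runs. This is a rough sketch, not a tested general solution: the function name remove_overlaps is hypothetical, and it assumes integer end points with start <= end and intervals small enough to expand in memory.

```r
# hypothetical helper: drops every integer covered by more than one interval
remove_overlaps <- function(start, end) {
  # expand each interval into its integers, remembering the source row
  id <- rep(seq_along(start), times = end - start + 1)
  points <- unlist(Map(seq, start, end))

  # keep only the integers that belong to exactly one interval
  counts <- table(points)
  keep <- points %in% as.numeric(names(counts)[counts == 1])
  id <- id[keep]
  points <- points[keep]

  # nothing survives: return an empty result
  if (length(points) == 0) {
    return(data.frame(id = integer(0), start = numeric(0), end = numeric(0)))
  }

  # a new run begins where the source interval changes or there is a gap
  new_run <- c(TRUE, diff(points) != 1 | diff(id) != 0)
  run <- cumsum(new_run)

  # collapse each run back into a start and an end
  data.frame(
    id = as.vector(tapply(id, run, min)),
    start = as.vector(tapply(points, run, min)),
    end = as.vector(tapply(points, run, max))
  )
}

result <- remove_overlaps(c(5, 9, 1, 2, 14, 18), c(10, 12, 3, 4, 16, 20))
result
```

For the data in the question this gives the same start and end values as the pairwise version. Unlike the pairwise version, an interval can be split or disappear here: with the intervals (1,10) and (4,6), the first one is returned as two pieces, (1,3) and (7,10), and the second one is dropped entirely. The id column records which original row each piece came from.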
