Python representation for a set of non-overlapping integer ranges

I'd like to represent a set of integer ranges in Python, where the set can be modified dynamically and tested for inclusion. Specifically, I want to apply this to address ranges or line numbers in a file.

For example, I could define the addresses I care about as the ranges:

200 - 400  
450 - 470  
700 - 900  

Then I want to be able to add a potentially overlapping range to the set, so that adding 460 - 490 turns the set into:

200 - 400  
450 - 490  
700 - 900  

I also want to be able to delete from the set, so that excluding the range 300 - 350 turns the set into:

200 - 300
350 - 400  
450 - 490  
700 - 900  

Finally, I want to be able to iterate over all integers included in the set, or test whether the set contains a particular value.

I'm wondering what the best way to do this is (particularly if there's something built into Python).

You're describing an interval tree.

pip install intervaltree

Usage:

from intervaltree import IntervalTree, Interval
tree = IntervalTree()
# Intervals are half-open: tree[200:400] covers 200 through 399.
tree[200:400] = True  # or you can use ranges as the "values"
tree[450:470] = True
tree[700:900] = True

Querying:

>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(700, 900, True)])
>>> tree[250]
{Interval(200, 400, True)}
>>> tree[150]
set()

Adding overlapping range:

>>> tree[450:490] = True
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(450, 490, True), Interval(700, 900, True)])
>>> tree.merge_overlaps()
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 490), Interval(700, 900, True)])

Discarding:

>>> tree.chop(300, 350)
>>> tree
IntervalTree([Interval(200, 300, True), Interval(350, 400, True), Interval(450, 490), Interval(700, 900, True)])

I implemented a similar thing in a different language.

Basic ideas:

  • Keep a tree of ranges ordered by the left bound. Instead of a real tree, you can keep a Python list and use bisect to search it in log time. This gives you the lookup / inclusion test.
  • Represent all operations as sub-range operations. A single element operation just works on a sub-range of length 1 internally.
  • Implement the basic sub-range operations: addition, which is simple, and subtraction, which may leave two sub-ranges if the middle is removed, or one sub-range, possibly empty.
  • After an addition, check the immediate neighbor sub-ranges and merge with one or both of them if they intersect. Continue in both directions until no more merges occur.

This should get you started.
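
The ideas above can be sketched in Python roughly as follows. This is only a minimal sketch, not a port of the implementation mentioned above: the class name `RangeSet` and its method names are made up for illustration, ranges are assumed to be half-open (`start` included, `stop` excluded, like Python's `range`), and two parallel sorted lists searched with `bisect` stand in for the tree.

```python
from bisect import bisect_left, bisect_right


class RangeSet:
    """Hypothetical sketch: integers stored as sorted, disjoint half-open ranges."""

    def __init__(self):
        self._starts = []  # sorted left bounds (inclusive)
        self._stops = []   # matching right bounds (exclusive), also sorted

    def add(self, start, stop):
        """Add [start, stop), merging with any ranges it overlaps or touches."""
        if start >= stop:
            return
        lo = bisect_left(self._stops, start)   # first range with stop >= start
        hi = bisect_right(self._starts, stop)  # first range with start > stop
        if lo < hi:
            # Ranges lo..hi-1 overlap or touch the new one: absorb them.
            start = min(start, self._starts[lo])
            stop = max(stop, self._stops[hi - 1])
        self._starts[lo:hi] = [start]
        self._stops[lo:hi] = [stop]

    def remove(self, start, stop):
        """Remove [start, stop); may split one stored range into two."""
        if start >= stop:
            return
        lo = bisect_right(self._stops, start)  # first range with stop > start
        hi = bisect_left(self._starts, stop)   # first range with start >= stop
        if lo >= hi:
            return  # nothing overlaps the removed span
        new_starts, new_stops = [], []
        if self._starts[lo] < start:           # left remainder survives
            new_starts.append(self._starts[lo])
            new_stops.append(start)
        if self._stops[hi - 1] > stop:         # right remainder survives
            new_starts.append(stop)
            new_stops.append(self._stops[hi - 1])
        self._starts[lo:hi] = new_starts
        self._stops[lo:hi] = new_stops

    def __contains__(self, value):
        # Find the last range starting at or before value, then check its end.
        i = bisect_right(self._starts, value) - 1
        return i >= 0 and value < self._stops[i]

    def __iter__(self):
        for start, stop in zip(self._starts, self._stops):
            yield from range(start, stop)

    def ranges(self):
        return list(zip(self._starts, self._stops))


rs = RangeSet()
for start, stop in [(200, 400), (450, 470), (700, 900)]:
    rs.add(start, stop)
rs.add(460, 490)
rs.remove(300, 350)
print(rs.ranges())  # [(200, 300), (350, 400), (450, 490), (700, 900)]
print(250 in rs, 320 in rs)  # True False
```

In this sketch the inclusion test is logarithmic, but `add` and `remove` use list slice assignment, which is linear in the number of stored ranges in the worst case. A real balanced tree would avoid that cost if the set holds very many ranges.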

Using Python's built-in range type, you could do something along the lines of:

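range_1 = range(450, 470)
range_temp = range(460, 490)

# True if at least one number in range_temp also lies in range_1
if any(num in range_1 for num in range_temp):
    # Replace range_1 with the smallest range covering both
    start = min(range_1.start, range_temp.start)
    stop = max(range_1.stop, range_temp.stop)
    range_1 = range(start, stop)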

Of course, further logic would be needed to handle overlaps across multiple ranges.
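
One possible shape for that further logic is to sort the ranges by their start and fold them together in a single pass. The sketch below assumes every range has the default step of 1 and treats adjacent ranges (one stopping exactly where the next starts) as mergeable. The helper name `merge_ranges` is made up for this example.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent step-1 range objects (hypothetical helper)."""
    merged = []
    # Skip empty ranges, then visit the rest in order of their start.
    for r in sorted((r for r in ranges if len(r) > 0), key=lambda r: r.start):
        if merged and r.start <= merged[-1].stop:
            # Overlaps or touches the previous range: extend that one.
            last = merged[-1]
            merged[-1] = range(last.start, max(last.stop, r.stop))
        else:
            merged.append(r)
    return merged


ranges = [range(200, 400), range(450, 470), range(700, 900), range(460, 490)]
merged = merge_ranges(ranges)
print(merged)  # [range(200, 400), range(450, 490), range(700, 900)]
print(any(465 in r for r in merged))  # True
```

This only covers adding ranges. Removing a span would still need the splitting step described in the question.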

