Numpy: given a set of ranges, is there an efficient way to find the set of ranges that are disjoint with all other ranges?
Is there an elegant way to find the set of disjoint ranges from a set of ranges in numpy?
ranges = [[0, 3], [2, 4], [5, 10]]  # there are about 50 000 elements
disjoint_ranges = []  # these are all disjoint
adjoint_ranges = []  # these do not all have to be mutually adjoint
for index, range_1 in enumerate(ranges):
    i, j = range_1  # all ranges are ordered s.t. i<j
    for range_2 in ranges[index+1:]:  # the list of ranges is ordered by increasing i
        a, b = range_2
        if a < j and a > i:
            # range_2 starts inside range_1, so both are overlapping
            if range_1 not in adjoint_ranges:
                adjoint_ranges.append(range_1)
            if range_2 not in adjoint_ranges:
                adjoint_ranges.append(range_2)
    # decide only after range_1 has been compared with every later range
    if range_1 not in adjoint_ranges:
        disjoint_ranges.append(range_1)
print(adjoint_ranges)
print(disjoint_ranges)
I'm not sure about numpy, but pandas has the following:
from functools import reduce
import pandas as pd
ranges = [
    pd.RangeIndex(10, 20),
    pd.RangeIndex(15, 25),
    pd.RangeIndex(30, 50),
    pd.RangeIndex(40, 60),
]
disjoints = reduce(lambda x, y : x.symmetric_difference(y), ranges)
disjoints
Int64Index([10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
dtype='int64')
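As a rough illustration of what the reduce over symmetric_difference computes, the sketch below repeats it with plain Python sets. The names spans, value_sets and odd_cover are made up for this illustration and are not part of pandas. Note that the result is the individual values covered by an odd number of ranges, not whole ranges, so it may not be exactly what the question asks for.

```python
from functools import reduce

# Same half-open spans as the RangeIndex objects above (hypothetical names)
spans = [(10, 20), (15, 25), (30, 50), (40, 60)]
value_sets = [set(range(lo, hi)) for lo, hi in spans]

# Chaining ^ (symmetric difference) keeps a value only if it
# appears in an odd number of the sets
odd_cover = sorted(reduce(lambda x, y: x ^ y, value_sets))
print(odd_cover)
```

The printed values should match the index shown above: the parts of each span that no other span covers.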
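Looping over numpy arrays somewhat defeats the purpose of using numpy. You can use the accumulate method to detect disjoint ranges.
With the ranges sorted by lower bound, you can accumulate the maximum of the upper bounds to determine how far the preceding ranges extend over the ones that follow. Comparing each range's lower bound with that running maximum tells you whether it overlaps an earlier range (forward overlap). Comparing each range's upper bound with the next range's lower bound then detects an overlap with a later range (backward overlap). Combining the forward and backward overlaps flags every overlapping range and, by elimination, leaves the ranges that are completely disjoint from all others: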
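import numpy as np
ranges = np.array( [ [1,8], [10,15], [2,5], [18,24], [7,10] ] )
# sort whole rows by lower bound; ranges.sort(axis=0) would sort each column separately and break the pairs
ranges = ranges[np.argsort(ranges[:,0])]
overlaps = np.zeros(ranges.shape[0], dtype=bool)  # the builtin bool works across NumPy versions
# forward overlap: the range starts before the furthest upper bound seen so far
overlaps[1:] = ranges[1:,0] < np.maximum.accumulate(ranges[:-1,1])
# backward overlap: the next range starts before this one ends
overlaps[:-1] |= ranges[1:,0] < ranges[:-1,1]
disjoints = ranges[~overlaps]
print(disjoints)
[[10 15]
 [18 24]]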
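For reuse, the same idea can be wrapped in a function. This is only a sketch: the name find_disjoint_ranges is hypothetical, and it assumes each range is a [lower, upper] pair with lower < upper, and that ranges which merely touch (one ends exactly where the next starts) count as disjoint.

```python
import numpy as np

def find_disjoint_ranges(ranges):
    """Return the rows that overlap no other row (hypothetical helper)."""
    ranges = np.asarray(ranges)
    if len(ranges) == 0:
        return ranges
    # Reorder whole rows by lower bound so each [lower, upper] pair stays intact
    ranges = ranges[np.argsort(ranges[:, 0], kind="stable")]
    starts, ends = ranges[:, 0], ranges[:, 1]
    overlaps = np.zeros(len(ranges), dtype=bool)
    # Forward: starts before the largest upper bound among earlier rows
    overlaps[1:] = starts[1:] < np.maximum.accumulate(ends[:-1])
    # Backward: the next row starts before this row ends
    overlaps[:-1] |= starts[1:] < ends[:-1]
    return ranges[~overlaps]

# The example from the question
result = find_disjoint_ranges([[0, 3], [2, 4], [5, 10]])
print(result)
```

Sorting dominates the cost, so this runs in O(n log n) rather than the O(n²) of the nested loops, which should matter for about 50 000 ranges.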