Given a set of ranges as follows:
dates = [[1200, 1300], [1100, 1300], [1200, 1300], [1200, 1400], [1100, 1400]]
I would like to efficiently extract all the possible intervals and then count the number of ranges that cover each interval.
For that example the resulting matrix of possible intervals would be:
   [1100, 1200]  [1200, 1300]  [1300, 1400]
0             0             1             0
1             1             1             0
2             0             1             0
3             0             1             1
4             1             1             1
Then, the sum by column gives the number of ranges that are in each interval:
[1100, 1200] 2
[1200, 1300] 5
[1300, 1400] 2
Here is an approach giving you the wanted numpy matrix m, with boolean values:
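import numpy as np

def getOverlap(a, b):
    # Length of the overlap between ranges a and b (0 if they do not overlap)
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

# Sorted unique endpoints across all ranges
nodes = np.unique(np.array(dates).flatten()).tolist()
# list() keeps the intervals reusable (zip is a one-shot iterator in Python 3)
intervals = list(zip(nodes[:-1], nodes[1:]))
# [(1100, 1200), (1200, 1300), (1300, 1400)]
# m[i][j] is True when interval i overlaps range j
m = np.array([[bool(getOverlap(i, d)) for d in dates] for i in intervals])
m.sum(axis=1)
# array([2, 5, 2])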
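As a possible alternative to the nested list comprehension, the same boolean matrix can be built with NumPy broadcasting. This is only a sketch: the names arr, lo, hi, covers and counts are introduced here for illustration, and it assumes every range has start < end, so that a range either fully covers an elementary interval or does not overlap it at all.

```python
import numpy as np

dates = [[1200, 1300], [1100, 1300], [1200, 1300], [1200, 1400], [1100, 1400]]

arr = np.asarray(dates)          # shape (n_ranges, 2)
nodes = np.unique(arr)           # sorted unique endpoints
lo, hi = nodes[:-1], nodes[1:]   # bounds of the elementary intervals

# A range covers an interval when it starts at or before the interval's
# lower bound and ends at or after its upper bound.
# Broadcasting (n_ranges, 1) against (n_intervals,) gives (n_ranges, n_intervals).
covers = (arr[:, [0]] <= lo) & (arr[:, [1]] >= hi)

# Column sums: number of ranges covering each interval
counts = covers.sum(axis=0)
```

Here covers has one row per range and one column per interval, which is the orientation of the matrix in the question (the transpose of m above).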
Note that if you want the 'matrix' to be a pandas DataFrame, simply do:
import pandas as pd
pd.DataFrame(m.transpose().astype(int), columns=intervals)
   (1100, 1200)  (1200, 1300)  (1300, 1400)
0             0             1             0
1             1             1             0
2             0             1             0
3             0             1             1
4             1             1             1
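If only the per-interval counts are needed and not the full matrix, a difference-array sweep may scale better, since it avoids comparing every range with every interval. The following is a sketch under that assumption, not part of the approach above; interval_counts is a hypothetical helper name, and it again assumes start < end for every range.

```python
import numpy as np

def interval_counts(ranges):
    """Return (intervals, counts) for the elementary intervals of `ranges`."""
    arr = np.asarray(ranges)
    nodes = np.unique(arr)                       # sorted unique endpoints
    # Position of each range's start and end among the sorted endpoints
    start_idx = np.searchsorted(nodes, arr[:, 0])
    end_idx = np.searchsorted(nodes, arr[:, 1])
    # +1 where a range opens, -1 where it closes
    delta = np.zeros(len(nodes), dtype=int)
    np.add.at(delta, start_idx, 1)
    np.add.at(delta, end_idx, -1)
    # The running sum is the number of open ranges on each elementary interval
    counts = np.cumsum(delta)[:-1]
    intervals = list(zip(nodes[:-1].tolist(), nodes[1:].tolist()))
    return intervals, counts.tolist()

dates = [[1200, 1300], [1100, 1300], [1200, 1300], [1200, 1400], [1100, 1400]]
intervals, counts = interval_counts(dates)
```

np.add.at is used instead of delta[start_idx] += 1 because several ranges share the same endpoint, and plain fancy-index assignment would apply each repeated index only once.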
I followed the method below, which counts how many times each identical range occurs. It could be more compact than this, but that's for another day!
c = [[1200, 1300], [1100, 1300], [1200, 1300], [1200, 1400], [1100, 1400]]
print("string values", c)
uniquea = {}  # maps str(range) -> number of occurrences
new = []      # distinct ranges, in order of first appearance
for i in c:
    j = str(i)
    if j in new:
        uniquea[j] += 1
    else:
        uniquea[j] = 1
        new.append(j)
print(uniquea, new)
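One way the loop above might be made more compact is with collections.Counter from the standard library. This is a sketch rather than a drop-in replacement: it uses tuples as keys instead of the string keys used above, and the names occurrences and distinct are introduced here for illustration.

```python
from collections import Counter

c = [[1200, 1300], [1100, 1300], [1200, 1300], [1200, 1400], [1100, 1400]]

# Lists are not hashable, so each range is converted to a tuple before counting
occurrences = Counter(tuple(r) for r in c)

# Distinct ranges in order of first appearance (Counter keeps insertion order)
distinct = list(occurrences)

print(occurrences, distinct)
```

Like the loop, this counts identical ranges only; it does not split the ranges into the elementary intervals asked about in the question.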