简体   繁体   中英

Split intervals by overlap into all periods and get maximum score

Given two timelines, A and B, where each timeline consist of non-overlapping periods and a score is associated with each period. I want to find all the periods and maximum score for the overlap between the timelines, while keeping the periods that does not have an overlap.

Ie for the periods that overlap, I want the start, end and maximum score of A and B. For the periods that does not overlap, i still want the start and end, and the associated score.

Periods in each timeline can line up with each other. For cases where the timelines overlap, the score should be consided included in the start and end points.

Visual explanation

An example is shown in the diagram below where with numbers for reference, and the score shown for each interval. A, B are timelines, and R is the resulting timeline i want to calculate.

    0        1         2         3         4         5
    12345678901234567890123456789012345678901234567890
A =     |---------|        |--|    |--------| |--|
B = |--------|--------| |--------|    |--|        |--|

R = |---|----|----|---| |--|--|--| |--|--|--| |--||--|

Scores per interval:
A =     |---10----|        |20|    |---30---| |40|
B = |---6----|---12---| |---17---|    |33|        |50|

R = |6--|10--|12--|12-| |17|20|17| |30|33|30| |40||50|

Example code

DROP TABLE IF EXISTS #TABLE_A
CREATE TABLE #TABLE_A (
timeline varchar(1),
time_start int,
time_end int,
score int
)

DROP TABLE IF EXISTS #TABLE_B
CREATE TABLE #TABLE_B (
timeline varchar(1),
time_start int,
time_end int,
score int
)

DROP TABLE IF EXISTS #TABLE_R
CREATE TABLE #TABLE_R (
time_start int,
time_end int,
score int
)

INSERT INTO #TABLE_A 
VALUES 
('A', 5, 15, 10)
,('A', 24, 27, 20)
,('A', 32, 41, 30)
,('A', 43, 46, 40)

INSERT INTO #TABLE_B 
VALUES 
('B', 1, 10, 6)
,('B', 10, 19, 12)
,('B', 21, 30, 17)
,('B', 35, 38, 33)
,('B', 47, 50, 50)

INSERT INTO #TABLE_R 
VALUES 
(1, 5, 6)
,(5, 10, 10)
,(10, 15, 12)
,(15, 19, 12)
,(21, 24, 17)
,(24, 27, 20)
,(27, 30, 17)
,(32, 35, 30)
,(35, 38, 33)
,(38, 41, 30)
,(43, 46, 40)
,(47, 50, 50)

I have tried to find all the cases where the periods overlap to determine the start and end of the period as well as the score. For scenarios where there is no overlap the period is difficult to select. I have taken the union for all the scenarios, which is quite a lot. I am therefore looking for an elegant solution to split the timelines into periods.

The query below first produces the total value range with a recursive cte , then generates the intervals with maximum score:

with cte(s, e) as (
  select (select min(least(a.time_start, b.time_start)) from #TABLE_A a 
          cross join #TABLE_B b), 
         (select max(greatest(a.time_end, b.time_end)) from #TABLE_A a 
          cross join #TABLE_B b)
  union all
  select c.s+1, c.e from cte c where c.s < e
),
intervals(s, v) as (
  select c.s, max(greatest(coalesce(a.score, 0), coalesce(b.score, 0))) v 
  from cte c left join #TABLE_A a on a.time_start <= c.s and c.s <= a.time_end 
  left join #TABLE_B b on b.time_start <= c.s and c.s <= b.time_end
  group by c.s
),
blocks(r, s, e, v, o, i, k) as (
  select t1.r, min(t1.s), max(t1.s), max(t1.v), min(t1.s), 0, 1 from (
     select i.*, (select count(*) from intervals i1 where i1.s < i.s and i1.v != i.v) r 
     from intervals i) t1
  group by t1.r
  union all
  select b.r, b.s, b.e, b.v, b.o + 1, case when b.k%5 = 0 then b.i + 1 else b.i end, b.k + 1 
  from blocks b where b.o < b.e
)
select t2.s, t2.e, t2.v from (select b.r, b.i, min(b.o) s, max(b.o) e, max(b.v) v from blocks b 
group by b.r, b.i) t2 
where t2.v != 0 order by t2.r

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM