简体   繁体   中英

Check if date falls between previous row start date and end date

I have the following sample data:

id  Ref     User        Task    Refernce    Start   End         rn
1   12222   Joe Bloggs  Task 1  Ref001  24/02/2014  20/07/2014  1
2   12568   Joe Bloggs  Task 1  Ref001  25/07/2014  12/10/2014  2
3   14757   Joe Bloggs  Task 1  Ref001  29/10/2014  11/01/2015  3
4   12493   Joe Bloggs  Task 1  Ref001  7/01/2015   6/04/2015   4
5   13694   Joe Bloggs  Task 2  Ref001  3/04/2014   20/07/2014  1
6   85569   Joe Bloggs  Task 2  Ref001  18/07/2014  12/10/2014  2
7   54769   Joe Bloggs  Task 2  Ref001  24/11/2014  5/01/2015   3
8   89716   Joe Bloggs  Task 2  Ref001  12/01/2015  6/04/2015   4

I need to check where any of the Start/End dates are between the previous rn Start/End date where the Task is the same type.

In the above data the rows flagged as overlapping would be:

4   12493   Joe Bloggs  Task 1  Ref001  7/01/2015   6/04/2015   4

because the Start Date of 07/01/2015 overlaps rn 3 End Date of 11/01/15

6   85569   Joe Bloggs  Task 2  Ref001  18/07/2014  12/10/2014  2

because the because the Start Date of 18/07/2014 overlaps rn 1 End Date of 20/07/14

can someone please advise how this is possible without using a cursor?

You can do this using lag() in SQL Server 2012+. In SQL Server 2008, I would recommend a join :

select s.*,
       (case when s.start between s2.start and s2.end then 1 else 0 end) as flg
from sample s left outer join
     sample sprev
     on s.id = sprev.id + 1;

When working with dates and you want to check for overlapping intervals, the calculation is a bit more complicated. Consider that for any two intervals [I1, I2] with starting dates [S1, S2] and ending dates [E1, E2], the following shows all the ways they can overlap.

I1: S1|----------|E1
I2:         S2|---------|E2

I1:    S1|----|E1
I2: S2|---------|E2

I1: S1|----------|E1
I2:     S2|---|E2

I1:        S1|----------|E1
I2: S2|---------|E2

It's easier to show when they don't overlap

I1: S1|-------|E1
I2:         S2|-------|E2

I1:         S1|-------|E1
I2: S2|-------|E2

From this we can derive the equation for "no overlap":

E2 <= S1 OR S2 >= E1

Since you're interested in "yes overlap", just invert:

E2 > S1 and S2 < E1

Or, if you order the dates so you know that, say, S2 will always come after S1, the test can be simplified to

E2 > S1

Based on that, here is the query:

select  s1.*, s2.Ref, s2.EndDt PrevEndDt
from    source  s1
join    source  s2
    on  s2.Task     = s1.Task
    and s2.StartDt  < s1.StartDt  -- examine only previous dates
    and s2.EndDt    > s1.StartDt; -- test for overlap

This just shows the interval that overlaps.

Notice the query ignores the task sequence value rn (and ID) and instead looks only at the starting dates. This has several benefits:

  1. it makes the query work even if there are gaps in the rn sequence and
  2. it finds all overlaps (if any) in the previous intervals of the task not just the immediately preceeding one.
  3. it orders the start dates thus simplifies the overlap test. This is not really a big deal. Calculations are cheap and this doesn't change the execution plan and makes no discernible difference in the run time. But, hey, simpler code is simpler code.

One assumption this relies on is that there is a unique constraint so that there cannot be the same start date within a task. That would seem to be an obvious constraint to have since you don't want overlaps. Plus having an index on (Task, StartDt) is a sensible step.

SQLFiddle seems to be down at the moment. But my test script is not large.

create table Source(
  ID        int not null,
  Ref       int not null,
  UserName  varchar( 16 ) not null,
  Task      varchar( 16 ) not null,
  Refernce  varchar( 16 ) not null,
  StartDt   date not null,
  EndDt     date not null,
  rn        smallint not null,
  constraint PK_Source primary key( ID ),
  constraint CK_Source_StartEnd check( StartDt < EndDt)
);

insert into Source
    select  1, 12222, 'Joe Bloggs', 'Task 1', 'Ref001', '2014-02-24', '2014-07-20', 1 union all
    select  2, 12568, 'Joe Bloggs', 'Task 1', 'Ref001', '2014-07-25', '2014-10-12', 2 union all
    select  3, 14757, 'Joe Bloggs', 'Task 1', 'Ref001', '2014-10-29', '2015-01-11', 3 union all
    select  4, 12493, 'Joe Bloggs', 'Task 1', 'Ref001', '2015-01-07', '2015-04-06', 4 union all

    select  5, 13694, 'Joe Bloggs', 'Task 2', 'Ref001', '2015-04-03', '2015-07-20', 1 union all
    select  6, 85569, 'Joe Bloggs', 'Task 2', 'Ref001', '2015-09-24', '2015-10-12', 2 union all
    select  7, 54769, 'Joe Bloggs', 'Task 2', 'Ref001', '2015-07-18', '2015-09-20', 3 union all
    select  8, 89716, 'Joe Bloggs', 'Task 2', 'Ref001', '2016-01-12', '2016-04-06', 4;

create unique index UQ_Source_TaskStart on Source( Task, StartDt );

Notice I changed the dates for Task 2 around a bit. The second entry no longer overlaps with the first entry -- the third entry does. If this is not a possibility with your data, just set it back.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM