简体   繁体   中英

T-SQL Query to Identify Date Ranges when an Event Happens

I am trying to identify the date ranges when an organization is on our "monitor" list.

My data looks like this:

OrgCode OrgName           ReviewDate    MonitorList
8000    Organization A    3/6/2014      1
8000    Organization A    6/4/2014      1
8000    Organization A    9/4/2014      1
8000    Organization A    12/4/2014     0
8000    Organization A    3/5/2015      1
8000    Organization A    6/4/2015      1
8000    Organization A    9/16/2015     1
8000    Organization A    12/16/2015    1
8000    Organization A    3/9/2016      1
8000    Organization A    6/2/2016      1
8000    Organization A    9/8/2016      1
8000    Organization A    12/8/2016     1
8000    Organization A    3/9/2017      0
8000    Organization A    6/14/2018     0

The query output I'm seeking looks like this:

OrgCode OrgName           MonitorStartDate  MonitorEndDate
8000    Organization A    3/6/2014          12/4/2014
8000    Organization A    3/5/2015          3/9/2017

This organization, Organization A, has appeared on our monitor list twice: from 3/6/2014 to 12/4/2014, and 3/5/2015 to 3/9/2017.

I've tried to accomplish this in a few ways, including,

  • Varieties of LEAD() and LAG() ; and,
  • GROUP BY OrgCode, OrgName, MonitorList and defining MonitorStartDate as MIN(ReviewDate) and MonitorEndDate as MAX(ReviewDate) .

The second method did not account for the fact that these organizations may be on/off the monitor list multiple times. I still think some combinations of LEAD() or LAG() might work; but, not by themselves.

Any guidance you folks can provide would be great and thanks for the help!

Use a running sum to classify rows into groups re-setting the value when 0 is encountered and lead to get the next row's date because the end date has to be from the first 0 encountered. Then use min and max on the corresponding columns with necessary groupings.

select orgcode,orgname
,min(case when monitorlist=1 then reviewdate end) as monitorstartdate
,max(next_dt) as monitorenddate
from (select t.*,
      sum(case when monitorlist=0 then 1 else 0 end) over(partition by orgcode order by reviewdate) as grp,
      lead(reviewdate) over(partition by orgcode order by reviewdate) as next_dt
      from tbl t
     ) t
group by orgcode,orgname,grp
having max(cast(monitorlist as int))=1

With this query

select orgcode,orgname,format(min(reviewdate),'M/d/yyyy') as monitorstartdate,format(max(next_dt),'M/d/yyyy') as monitorenddate
from (select t.*,
   sum(case when monitorlist=0 then 1 else 0 end) 
     over(partition by orgcode order by reviewdate) as grp,
     lead(reviewdate) over(partition by orgcode order by reviewdate) as next_dt
   from tbl t
   ) t
group by orgcode,orgname,grp,MonitorList
having MonitorList = 1

the result is as follows

orgcode     orgname             monitorstartdate    monitorenddate
8000        "Organization A"    3/6/2014            12/4/2014
8000        "Organization A"    3/5/2015            3/9/2017

The Fiddle link is here if people want to verify.

You can identify the groups by counting the number of 0's on or after each row. The rest is just aggregation:

select orgcode, orgname, min(ReviewDate) as MonitorStartDate,
       coalesce(min(case when monitorlist = 0 then ReviewDate end),
                max(ReviewDate)
               ) as MontiroEndDate
from (select t.*,
             sum(case when monitorlist = 0 then 1 else 0 end) over (partition by orgcode order by reviewdate desc) as grp             
      from t
     ) t
group by orgcode, orgname, grp
having max(monitorlist) = 1;

The logic for the end date is aa bit tricky:

  • It is the ReviewDate of the "0" record.
  • If there are none, the latest ReviewDate is used.

Here is a SQL Fiddle demonstrating it.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM