I am trying to identify the date ranges when an organization is on our "monitor" list.
My data looks like this:
OrgCode OrgName ReviewDate MonitorList
8000 Organization A 3/6/2014 1
8000 Organization A 6/4/2014 1
8000 Organization A 9/4/2014 1
8000 Organization A 12/4/2014 0
8000 Organization A 3/5/2015 1
8000 Organization A 6/4/2015 1
8000 Organization A 9/16/2015 1
8000 Organization A 12/16/2015 1
8000 Organization A 3/9/2016 1
8000 Organization A 6/2/2016 1
8000 Organization A 9/8/2016 1
8000 Organization A 12/8/2016 1
8000 Organization A 3/9/2017 0
8000 Organization A 6/14/2018 0
The query output I'm seeking looks like this:
OrgCode OrgName MonitorStartDate MonitorEndDate
8000 Organization A 3/6/2014 12/4/2014
8000 Organization A 3/5/2015 3/9/2017
This organization, Organization A, has appeared on our monitor list twice: from 3/6/2014 to 12/4/2014, and 3/5/2015 to 3/9/2017.
I've tried to accomplish this in a few ways, including,
LEAD()
and LAG()
; and, GROUP BY OrgCode, OrgName, MonitorList
and defining MonitorStartDate as MIN(ReviewDate)
and MonitorEndDate as MAX(ReviewDate)
. The second method did not account for the fact that these organizations may be on/off the monitor list multiple times. I still think some combinations of LEAD()
or LAG()
might work; but, not by themselves.
Any guidance you folks can provide would be great and thanks for the help!
Use a running sum to classify rows into groups re-setting the value when 0
is encountered and lead
to get the next row's date because the end date has to be from the first 0 encountered. Then use min
and max
on the corresponding columns with necessary groupings.
select orgcode,orgname
,min(case when monitorlist=1 then reviewdate end) as monitorstartdate
,max(next_dt) as monitorenddate
from (select t.*,
sum(case when monitorlist=0 then 1 else 0 end) over(partition by orgcode order by reviewdate) as grp,
lead(reviewdate) over(partition by orgcode order by reviewdate) as next_dt
from tbl t
) t
group by orgcode,orgname,grp
having max(cast(monitorlist as int))=1
With this query
select orgcode,orgname,format(min(reviewdate),'M/d/yyyy') as monitorstartdate,format(max(next_dt),'M/d/yyyy') as monitorenddate
from (select t.*,
sum(case when monitorlist=0 then 1 else 0 end)
over(partition by orgcode order by reviewdate) as grp,
lead(reviewdate) over(partition by orgcode order by reviewdate) as next_dt
from tbl t
) t
group by orgcode,orgname,grp,MonitorList
having MonitorList = 1
the result is as follows
orgcode orgname monitorstartdate monitorenddate
8000 "Organization A" 3/6/2014 12/4/2014
8000 "Organization A" 3/5/2015 3/9/2017
The Fiddle link is here if people want to verify.
You can identify the groups by counting the number of 0's on or after each row. The rest is just aggregation:
select orgcode, orgname, min(ReviewDate) as MonitorStartDate,
coalesce(min(case when monitorlist = 0 then ReviewDate end),
max(ReviewDate)
) as MontiroEndDate
from (select t.*,
sum(case when monitorlist = 0 then 1 else 0 end) over (partition by orgcode order by reviewdate desc) as grp
from t
) t
group by orgcode, orgname, grp
having max(monitorlist) = 1;
The logic for the end date is aa bit tricky:
ReviewDate
of the "0" record. ReviewDate
is used. Here is a SQL Fiddle demonstrating it.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.