T-SQL Query to Identify Date Ranges when an Event Happens
I am trying to identify the date ranges when an organization is on our "monitor" list.
My data looks like this:
OrgCode OrgName ReviewDate MonitorList
8000 Organization A 3/6/2014 1
8000 Organization A 6/4/2014 1
8000 Organization A 9/4/2014 1
8000 Organization A 12/4/2014 0
8000 Organization A 3/5/2015 1
8000 Organization A 6/4/2015 1
8000 Organization A 9/16/2015 1
8000 Organization A 12/16/2015 1
8000 Organization A 3/9/2016 1
8000 Organization A 6/2/2016 1
8000 Organization A 9/8/2016 1
8000 Organization A 12/8/2016 1
8000 Organization A 3/9/2017 0
8000 Organization A 6/14/2018 0
The query output I'm seeking looks like this:
OrgCode OrgName MonitorStartDate MonitorEndDate
8000 Organization A 3/6/2014 12/4/2014
8000 Organization A 3/5/2015 3/9/2017
This organization, Organization A, has appeared on our monitor list twice: from 3/6/2014 to 12/4/2014, and from 3/5/2015 to 3/9/2017.
I've tried to accomplish this in a few ways, including LEAD() and LAG(), and GROUP BY OrgCode, OrgName, MonitorList while defining MonitorStartDate as MIN(ReviewDate) and MonitorEndDate as MAX(ReviewDate). The second method did not account for the fact that these organizations may go on and off the monitor list multiple times. I still think some combination of LEAD() or LAG() might work, but not by themselves.
Any guidance you folks can provide would be great and thanks for the help!
Use a running sum to classify rows into groups, starting a new group each time a 0 is encountered, and use lead to get the next row's date, because the end date has to come from the first 0 encountered after the run of 1s. Then use min and max on the corresponding columns with the necessary groupings.
select orgcode,orgname
,min(case when monitorlist=1 then reviewdate end) as monitorstartdate
,max(next_dt) as monitorenddate
from (select t.*,
sum(case when monitorlist=0 then 1 else 0 end) over(partition by orgcode order by reviewdate) as grp,
lead(reviewdate) over(partition by orgcode order by reviewdate) as next_dt
from tbl t
) t
group by orgcode,orgname,grp
having max(cast(monitorlist as int))=1
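To make the grouping step easier to follow, here is a minimal sketch that runs the same query against the sample data and also prints the intermediate grp and next_dt columns. It rests on assumptions that are not part of the answer: Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) stands in for SQL Server, dates are stored as ISO strings so they sort chronologically, and monitorlist is a plain integer, so the cast is dropped.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; dates are ISO strings.
rows = [
    ("2014-03-06", 1), ("2014-06-04", 1), ("2014-09-04", 1), ("2014-12-04", 0),
    ("2015-03-05", 1), ("2015-06-04", 1), ("2015-09-16", 1), ("2015-12-16", 1),
    ("2016-03-09", 1), ("2016-06-02", 1), ("2016-09-08", 1), ("2016-12-08", 1),
    ("2017-03-09", 0), ("2018-06-14", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tbl (orgcode int, orgname text, reviewdate text, monitorlist int)"
)
conn.executemany("insert into tbl values (8000, 'Organization A', ?, ?)", rows)

# Inner query from the answer: running count of 0s plus the next row's date.
inner = """
select t.*,
       sum(case when monitorlist = 0 then 1 else 0 end)
           over (partition by orgcode order by reviewdate) as grp,
       lead(reviewdate) over (partition by orgcode order by reviewdate) as next_dt
from tbl t
"""

# Intermediate view: which group each row falls into.
staged = conn.execute(
    f"select reviewdate, monitorlist, grp, next_dt from ({inner}) s order by reviewdate"
).fetchall()
for row in staged:
    print(row)

# Outer query from the answer: one row per group that contains at least one 1.
ranges = conn.execute(f"""
select orgcode, orgname,
       min(case when monitorlist = 1 then reviewdate end) as monitorstartdate,
       max(next_dt) as monitorenddate
from ({inner}) s
group by orgcode, orgname, grp
having max(monitorlist) = 1
order by monitorstartdate
""").fetchall()
print(ranges)
```

The printed grp column stays at 0 for the first three rows, becomes 1 at the 12/4/2014 row, and increases again at each later 0, which is why each 0 row is grouped with the run of 1s that follows it rather than the run it closes.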
With this query
select orgcode,orgname,format(min(reviewdate),'M/d/yyyy') as monitorstartdate,format(max(next_dt),'M/d/yyyy') as monitorenddate
from (select t.*,
sum(case when monitorlist=0 then 1 else 0 end)
over(partition by orgcode order by reviewdate) as grp,
lead(reviewdate) over(partition by orgcode order by reviewdate) as next_dt
from tbl t
) t
group by orgcode,orgname,grp,MonitorList
having MonitorList = 1
the result is as follows
orgcode orgname monitorstartdate monitorenddate
8000 "Organization A" 3/6/2014 12/4/2014
8000 "Organization A" 3/5/2015 3/9/2017
The Fiddle link is here if people want to verify.
You can identify the groups by counting the number of 0's on or after each row. The rest is just aggregation:
select orgcode, orgname, min(ReviewDate) as MonitorStartDate,
coalesce(min(case when monitorlist = 0 then ReviewDate end),
max(ReviewDate)
) as MonitorEndDate
from (select t.*,
sum(case when monitorlist = 0 then 1 else 0 end) over (partition by orgcode order by reviewdate desc) as grp
from t
) t
group by orgcode, orgname, grp
having max(monitorlist) = 1;
The logic for the end date is a bit tricky: if the group contains a "0" record, the end date is the ReviewDate of that "0" record; otherwise, the latest ReviewDate in the group is used.
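As a rough check of the descending count and the coalesce fallback, the sketch below runs the query on two organizations. Several things here are assumptions rather than part of the answer: SQLite (3.25 or later) via Python's sqlite3 module stands in for SQL Server, dates are ISO strings, and "Organization B" (code 9000) is a hypothetical organization with the same history as Organization A but no closing 0, so it is still on the list.

```python
import sqlite3

# First twelve reviews from the question; dates rewritten as ISO strings.
base = [
    ("2014-03-06", 1), ("2014-06-04", 1), ("2014-09-04", 1), ("2014-12-04", 0),
    ("2015-03-05", 1), ("2015-06-04", 1), ("2015-09-16", 1), ("2015-12-16", 1),
    ("2016-03-09", 1), ("2016-06-02", 1), ("2016-09-08", 1), ("2016-12-08", 1),
]
closing = [("2017-03-09", 0), ("2018-06-14", 0)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (orgcode int, orgname text, reviewdate text, monitorlist int)"
)
conn.executemany("insert into t values (8000, 'Organization A', ?, ?)", base + closing)
# Hypothetical organization that never received a closing 0.
conn.executemany("insert into t values (9000, 'Organization B', ?, ?)", base)

query = """
select orgcode, orgname, min(reviewdate) as monitorstartdate,
       coalesce(min(case when monitorlist = 0 then reviewdate end),
                max(reviewdate)) as monitorenddate
from (select t.*,
             sum(case when monitorlist = 0 then 1 else 0 end)
                 over (partition by orgcode order by reviewdate desc) as grp
      from t
     ) s
group by orgcode, orgname, grp
having max(monitorlist) = 1
order by orgcode, monitorstartdate
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Counting in descending order puts each 0 row in the same group as the run of 1s that precedes it, so the end date can be read from within the group without lead. For the hypothetical Organization B, the second range has no 0 row, and the end date falls back to the latest review date, 2016-12-08.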