
How to group by consecutive records in SQL Server

I have this table, and I need to group the consecutive records that share the same value in the [id_2] column:

Dataset:

id_1 id_2 datemin            datemax

1    0    2019-01-01 10:14   2019-01-01 15:20
1    1    2019-01-01 15:21   2019-01-01 16:01
1    0    2019-01-01 16:02   2019-01-01 16:08
1    1    2019-01-01 16:09   2019-01-01 16:40
1    1    2019-01-01 16:41   2019-01-01 17:50
1    1    2019-01-01 17:51   2019-01-01 18:36
1    0    2019-01-01 18:36   2019-01-01 19:07
1    1    2019-01-01 19:08   2019-01-01 22:01
1    0    2019-01-01 22:02   2019-01-01 22:47
1    1    2019-01-01 22:47   2019-01-01 23:05
1    0    2019-01-01 23:06   2019-01-01 23:59

Expected result:

id_1 id_2 datemin            datemax

1    0    2019-01-01 10:14   2019-01-01 15:20
1    1    2019-01-01 15:21   2019-01-01 16:01
1    0    2019-01-01 16:02   2019-01-01 16:08
1    1    2019-01-01 16:09   2019-01-01 18:36
1    0    2019-01-01 18:36   2019-01-01 19:07
1    1    2019-01-01 19:08   2019-01-01 22:01
1    0    2019-01-01 22:02   2019-01-01 22:47
1    1    2019-01-01 22:47   2019-01-01 23:05
1    0    2019-01-01 23:06   2019-01-01 23:59

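Each run of consecutive rows with the same [id_2] must be collapsed into one row, keeping the earliest datemin and the latest datemax of the run.

I have tried other examples, but I could not understand them at all.

Thank you very much!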

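This is an example of a gaps-and-islands problem. If I assume that the time frames tile together (i.e. there are no gaps), or that you don't care about gaps, then the simplest method is probably the difference of row numbers: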

select id_1, id_2, min(datemin) as datemin, max(datemax) as datemax
from (select t.*,
             row_number() over (partition by id_1 order by datemin) as seqnum,
             row_number() over (partition by id_1, id_2 order by datemin) as seqnum_2
      from t
     ) t
group by id_1, id_2, (seqnum - seqnum_2);

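Why this works is a little tricky to explain. But if you look at the results of the subquery, you should see how the difference between the two row numbers defines the groups you are looking for. Within a run of consecutive rows that share the same id_2, both row numbers increase by one per row, so their difference stays constant; as soon as a row with a different id_2 intervenes, only seqnum advances, so the next run of that id_2 gets a different difference.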
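To make the subquery result concrete, here is a minimal sketch that runs the same two `row_number()` expressions over the sample rows from the question. It uses Python's built-in `sqlite3` module with an in-memory SQLite database purely as a stand-in for SQL Server (an assumption of this sketch; it needs SQLite 3.25 or later for window functions), and the helper name `seqnum_rows` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id_1, id_2, datemin, datemax).
ROWS = [
    (1, 0, "2019-01-01 10:14", "2019-01-01 15:20"),
    (1, 1, "2019-01-01 15:21", "2019-01-01 16:01"),
    (1, 0, "2019-01-01 16:02", "2019-01-01 16:08"),
    (1, 1, "2019-01-01 16:09", "2019-01-01 16:40"),
    (1, 1, "2019-01-01 16:41", "2019-01-01 17:50"),
    (1, 1, "2019-01-01 17:51", "2019-01-01 18:36"),
    (1, 0, "2019-01-01 18:36", "2019-01-01 19:07"),
    (1, 1, "2019-01-01 19:08", "2019-01-01 22:01"),
    (1, 0, "2019-01-01 22:02", "2019-01-01 22:47"),
    (1, 1, "2019-01-01 22:47", "2019-01-01 23:05"),
    (1, 0, "2019-01-01 23:06", "2019-01-01 23:59"),
]

# The inner query of the answer, with the two row numbers exposed.
SUBQUERY = """
select id_2,
       row_number() over (partition by id_1 order by datemin) as seqnum,
       row_number() over (partition by id_1, id_2 order by datemin) as seqnum_2
from t
order by datemin
"""


def seqnum_rows():
    """Return (id_2, seqnum, seqnum_2, seqnum - seqnum_2) for each sample row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id_1 int, id_2 int, datemin text, datemax text)")
    conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)
    rows = conn.execute(SUBQUERY).fetchall()
    conn.close()
    return [(id_2, s, s2, s - s2) for id_2, s, s2 in rows]


if __name__ == "__main__":
    print("id_2 seqnum seqnum_2 diff")
    for id_2, s, s2, diff in seqnum_rows():
        print(f"{id_2:>4} {s:>6} {s2:>8} {diff:>4}")
```

The three consecutive rows with id_2 = 1 (the 16:09, 16:41 and 17:51 rows) all get a difference of 2, so they land in one group. The second and third rows both get a difference of 1 but have different id_2 values, which is why id_2 has to stay in the `group by` alongside the difference.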

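This is a bit tricky, but it can be done by combining lead, lag and row_number.

You need lead and lag so that each row is compared with both the previous row and the next one. The row number then supplies a unique value, so that the group by keeps rows that are not part of a consecutive run separate. When there is a match, the case expression returns -99 instead, a value that cannot collide with any row number. With that in place, and with the help of a subquery, this should work.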

with cte as (
select 1 as ID_1, 0 as ID_2, cast('2019-01-01 10:14:00' as datetime) Datemin, cast('2019-01-01 15:20:00' as datetime) as Datemax union all 
select 1 as ID_1, 1 as ID_2, cast('2019-01-01 15:21:00' as datetime) Datemin, cast('2019-01-01 16:01:00' as datetime) as Datemax union all 
select 1 as ID_1, 0 as ID_2, cast('2019-01-01 16:02:00' as datetime) Datemin, cast('2019-01-01 16:08:00' as datetime) as Datemax union all 
select 1 as ID_1, 1 as ID_2, cast('2019-01-01 16:09:00' as datetime) Datemin, cast('2019-01-01 16:40:00' as datetime) as Datemax union all 
select 1 as ID_1, 1 as ID_2, cast('2019-01-01 16:41:00' as datetime) Datemin, cast('2019-01-01 17:50:00' as datetime) as Datemax union all 
select 1 as ID_1, 1 as ID_2, cast('2019-01-01 17:51:00' as datetime) Datemin, cast('2019-01-01 18:36:00' as datetime) as Datemax union all 
select 1 as ID_1, 0 as ID_2, cast('2019-01-01 18:36:00' as datetime) Datemin, cast('2019-01-01 19:07:00' as datetime) as Datemax union all 
select 1 as ID_1, 1 as ID_2, cast('2019-01-01 19:08:00' as datetime) Datemin, cast('2019-01-01 22:01:00' as datetime) as Datemax union all 
select 1 as ID_1, 0 as ID_2, cast('2019-01-01 22:02:00' as datetime) Datemin, cast('2019-01-01 22:47:00' as datetime) as Datemax union all 
select 1 as ID_1, 1 as ID_2, cast('2019-01-01 22:47:00' as datetime) Datemin, cast('2019-01-01 23:05:00' as datetime) as Datemax union all 
select 1 as ID_1, 0 as ID_2, cast('2019-01-01 23:06:00' as datetime) Datemin, cast('2019-01-01 23:59:00' as datetime) as Datemax  ) 

--select * from cte; 
select id_1, id_2, min(datemin) as min_date, max(datemax) as max_date
from (select *,
             -- rows whose neighbour has the same id_2 share the marker -99;
             -- every other row keeps its own row number and stays alone
             case when lead(id_2) over (order by datemin, datemax) = id_2
                    or lag(id_2) over (order by datemin, datemax) = id_2
                  then -99
                  else row_number() over (order by datemin, datemax)
             end as Comparison
      from cte
     ) z
group by id_1, id_2, Comparison
order by min_date;
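One thing to watch with the -99 marker: every row that belongs to any run receives the same marker, so two separate runs of the same id_2 would be merged into a single group. The sample data has only one such run, so the problem does not show up there. A common variant that avoids this flags the first row of each run with `lag` and then numbers the runs with a running `sum`. The sketch below is not part of the answer above; it runs that query on a small made-up dataset with two runs of id_2 = 1, again using Python's `sqlite3` as a stand-in for SQL Server (SQLite 3.25 or later). The names `merge_runs`, `is_start` and `grp` are hypothetical.

```python
import sqlite3

# Hypothetical rows (not from the question): two separate runs of id_2 = 1.
TWO_RUNS = [
    (1, 1, "2019-01-01 10:00", "2019-01-01 10:59"),
    (1, 1, "2019-01-01 11:00", "2019-01-01 11:59"),
    (1, 0, "2019-01-01 12:00", "2019-01-01 12:59"),
    (1, 1, "2019-01-01 13:00", "2019-01-01 13:59"),
    (1, 1, "2019-01-01 14:00", "2019-01-01 14:59"),
]

ISLANDS = """
select id_1, id_2, min(datemin) as min_date, max(datemax) as max_date
from (select x.*,
             -- running count of run starts = run number within each id_1
             sum(is_start) over (partition by id_1
                                 order by datemin, datemax) as grp
      from (select t.*,
                   -- 1 when id_2 differs from the previous row (or there is none)
                   case when lag(id_2) over (partition by id_1
                                             order by datemin, datemax) = id_2
                        then 0 else 1 end as is_start
            from t
           ) x
     ) y
group by id_1, id_2, grp
order by min_date
"""


def merge_runs(rows):
    """Collapse consecutive rows with the same id_2 into one row per run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id_1 int, id_2 int, datemin text, datemax text)")
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    result = conn.execute(ISLANDS).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in merge_runs(TWO_RUNS):
        print(row)
```

On this dataset the two runs of id_2 = 1 stay separate (10:00 to 11:59 and 13:00 to 14:59), whereas the -99 marker would fold all four rows into one group spanning 10:00 to 14:59. In SQL Server, `lag` and a windowed `sum` with `order by` are available from SQL Server 2012 onward.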

