
Consolidate consecutive rows, maybe using SQL Lead/Lag

I am trying to reduce a table that records every change to a case into one that records only the changes in caseworker.

My initial code was something like:

SELECT
      id_no
    , caseworker
    , MIN(Start_Date)
    , MAX(End_Date)
FROM Table
GROUP BY
      id_no
    , caseworker

This works in most cases, but not in cases like this one, where the same caseworker appears in two disjoint time periods. Any ideas on how to find "local" min and max?

I can't tell whether my picture is visible to you, but my table looks something like this:

Id_no Caseworker Start End
1 None 2014-04-10 2020-02-17
1 KUN 2020-02-17 2020-03-19
1 KUN 2020-03-19 2020-03-21
1 KUN 2020-03-21 2020-03-31
1 KJE 2020-03-31 2020-04-22
1 KUN 2020-04-22 2021-12-02
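
To make the failure concrete, here is a small Python sketch that mimics the GROUP BY on id_no and caseworker over the sample rows. It is only an illustration, not the original query: the tuple layout and the helper name naive_group are assumptions, and the "None" caseworker is assumed to be a NULL. KUN's two separate periods collapse into a single span that also swallows KJE's period.

```python
# Sample rows from the table above: (id_no, caseworker, start, end).
# Dates are ISO strings, so plain string comparison orders them correctly.
rows = [
    (1, None,  "2014-04-10", "2020-02-17"),
    (1, "KUN", "2020-02-17", "2020-03-19"),
    (1, "KUN", "2020-03-19", "2020-03-21"),
    (1, "KUN", "2020-03-21", "2020-03-31"),
    (1, "KJE", "2020-03-31", "2020-04-22"),
    (1, "KUN", "2020-04-22", "2021-12-02"),
]

def naive_group(rows):
    """Mimic GROUP BY id_no, caseworker with MIN(start) and MAX(end)."""
    spans = {}
    for id_no, worker, start, end in rows:
        key = (id_no, worker)
        if key in spans:
            lo, hi = spans[key]
            spans[key] = (min(lo, start), max(hi, end))
        else:
            spans[key] = (start, end)
    return spans

naive = naive_group(rows)
print(naive[(1, "KUN")])  # ('2020-02-17', '2021-12-02')
```

The printed span for KUN runs from 2020-02-17 to 2021-12-02, which overlaps the period from 2020-03-31 to 2020-04-22 that actually belongs to KJE.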

The following should work. The query marks every row where the caseworker differs from the previous row for the same id_no. A running sum of those marks then numbers the groups, so in your data the groups would be rows (1), (2, 3, 4), (5) and (6):

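with cte1 as (
    -- chg = 1 on the first row of each id_no and whenever caseworker differs from the previous row
    -- (a NULL caseworker never compares equal, so each such row also gets chg = 1)
    select *, case when lag(caseworker) over (partition by id_no order by start) = caseworker then 0 else 1 end as chg
    from t
), cte2 as (
    -- the running total of chg gives every run of consecutive rows its own group number
    select *, sum(chg) over (partition by id_no order by start) as grp
    from cte1
)
select id_no, caseworker, min(start) as start_date, max([end]) as end_date
from cte2
group by id_no, grp, caseworker
order by id_no, grp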
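
As a rough check, the same query can be run against the sample rows with Python's built-in sqlite3 module. This is a sketch under a few assumptions: the table name t, the column names start_date and end_date (chosen to avoid quoting the keyword END), and the "None" caseworker stored as NULL are all illustrative, and the window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database with the sample rows; table and column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id_no integer, caseworker text, start_date text, end_date text)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        (1, None,  "2014-04-10", "2020-02-17"),
        (1, "KUN", "2020-02-17", "2020-03-19"),
        (1, "KUN", "2020-03-19", "2020-03-21"),
        (1, "KUN", "2020-03-21", "2020-03-31"),
        (1, "KJE", "2020-03-31", "2020-04-22"),
        (1, "KUN", "2020-04-22", "2021-12-02"),
    ],
)

QUERY = """
with cte1 as (
    -- mark rows whose caseworker differs from the previous row
    select *,
           case when lag(caseworker) over (partition by id_no order by start_date) = caseworker
                then 0 else 1 end as chg
    from t
), cte2 as (
    -- running sum of the marks numbers each run of consecutive rows
    select *, sum(chg) over (partition by id_no order by start_date) as grp
    from cte1
)
select id_no, caseworker, min(start_date), max(end_date)
from cte2
group by id_no, grp, caseworker
order by id_no, grp
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

With these rows the query returns four periods: the unassigned one, KUN from 2020-02-17 to 2020-03-31, KJE from 2020-03-31 to 2020-04-22, and KUN again from 2020-04-22 to 2021-12-02.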
