简体   繁体   中英

Postgresql - min/max date range within group

We have a transactional table that stores the data much like a historical table where any time a status changes (or other attributes) it gets effective dated.

Example:

  Product | Status   | Start Date | End Date
----------+------- --+------------+-----------
widget a  | active   | 02/01/2020 | 02/30/2020
widget a  | active   | 03/01/2020 | 03/19/2020
widget a  | inactive | 03/20/2020 | 05/01/2020
widget a  | active   | 05/02/2020 | 08/31/2020
widget b  | active   | 02/01/2020 | 05/31/2020
widget b  | inactive | 06/01/2020 | 06/31/2020

I am trying to roll up this data based on the min and max dates as the status changes (as I said, other attributes contribute to the record changing, but I am only concerned with status changes). So in above example, 'widget a' would have three records: active from 02/01/2020 - 03/19/2020, inactive from 03/20/2020 - 05/01/2020 and active from 05/02/2020 - 08/31/2020. This can easily be done using an ETL tool but I would like to get this into a view.

What is the best way to do this while being mindful of performance

This is postgresql 10

This is a gaps-and-islands problem, where you want to group together adjacent rows that have the same product and status.

Here is an approach that uses the difference between row numbers to build the groups:

select product, status, min(start_date) start_date, max(end_date) end_date
from (
    select t.*, 
        row_number() over(partition by product order by start_date) rn1,
        row_number() over(partition by product, status order by start_date) rn2
    from mytable t
) t
group by product, rn1 - rn2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM