Postgresql - min/max date range within group

Question

We have a transactional table that stores the data much like a historical table where any time a status changes (or other attributes) it gets effective dated.

Example:

  Product | Status   | Start Date | End Date
----------+------- --+------------+-----------
widget a  | active   | 02/01/2020 | 02/30/2020
widget a  | active   | 03/01/2020 | 03/19/2020
widget a  | inactive | 03/20/2020 | 05/01/2020
widget a  | active   | 05/02/2020 | 08/31/2020
widget b  | active   | 02/01/2020 | 05/31/2020
widget b  | inactive | 06/01/2020 | 06/31/2020

I am trying to roll up this data based on the min and max dates as the status changes (as I said, other attributes contribute to the record changing, but I am only concerned with status changes). So in above example, 'widget a' would have three records: active from 02/01/2020 - 03/19/2020, inactive from 03/20/2020 - 05/01/2020 and active from 05/02/2020 - 08/31/2020. This can easily be done using an ETL tool but I would like to get this into a view.

What is the best way to do this while being mindful of performance

This is postgresql 10

Answer 1

This is a gaps-and-islands problem, where you want to group together adjacent rows that have the same product and status.

Here is an approach that uses the difference between row numbers to build the groups:

select product, status, min(start_date) start_date, max(end_date) end_date
from (
    select t.*, 
        row_number() over(partition by product order by start_date) rn1,
        row_number() over(partition by product, status order by start_date) rn2
    from mytable t
) t
group by product, rn1 - rn2

Postgresql - min/max date range within group

Question

1 answers

solution1
1 ACCPTED 2020-10-14 21:26:57

Postgresql - min/max date range within group

Question

1 answers

solution1 1 ACCPTED 2020-10-14 21:26:57

solution1
1 ACCPTED 2020-10-14 21:26:57