
Group records with multiple sets in sql

I need to optimize a select query so that it returns fewer records when consecutive rows hold the same value on different dates. The rows should be sorted by date, and consecutive rows should be combined into a single row until a different c_val is found. Typically the data looks like this:

date       c_val
1/1/2016    200
2/1/2016    200
3/1/2016    300
4/1/2016    300
5/1/2016    300
6/1/2016    200
7/1/2016    200

Then my output should be as follows.

start_date  end_date    c_val
1/1/2016    2/1/2016    200
3/1/2016    5/1/2016    300
6/1/2016    7/1/2016    200

The query I am using for now is:

select min(date) as start_date, max(date) as end_date, c_val
from t_ord
group by c_val;

But this returns only two records, because it groups by c_val alone. I think I need an additional over clause to order the rows and start a new group whenever a new value is found. Is there a feature for this in Postgres?

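You can use a difference of row numbers approach to classify consecutive rows (ordered by date) with the same c_val into one group and start a new group when a new value is encountered. Within a run of equal values, the overall row number and the per-c_val row number both increase by one on every row, so their difference stays constant; it changes only when rows with another value come in between. After this is done, get the min and max date of each group per c_val.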

select min(date) as startdate,max(date) as enddate,c_val
from (select c_val,date,row_number() over(order by date)
                        -row_number() over(partition by c_val order by date) as grp
      from t_ord
     ) t
group by c_val,grp;
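To see why the difference of the two row numbers identifies each run, here is a minimal Python sketch that walks the sample rows from the question and computes the same grp value by hand. It is only an illustration, not part of the original answer: the function name island_ranges is hypothetical, the dates are kept as plain strings, and the rows are assumed to be already sorted by date.

```python
from collections import defaultdict

# Sample rows from the question as (date, c_val), assumed sorted by date.
rows = [
    ("1/1/2016", 200),
    ("2/1/2016", 200),
    ("3/1/2016", 300),
    ("4/1/2016", 300),
    ("5/1/2016", 300),
    ("6/1/2016", 200),
    ("7/1/2016", 200),
]

def island_ranges(rows):
    """Return (start_date, end_date, c_val) for each run of equal c_val."""
    per_value_rn = defaultdict(int)  # mimics row_number() over (partition by c_val order by date)
    groups = {}                      # (c_val, grp) -> [start_date, end_date]
    # overall_rn mimics row_number() over (order by date)
    for overall_rn, (date, c_val) in enumerate(rows, start=1):
        per_value_rn[c_val] += 1
        grp = overall_rn - per_value_rn[c_val]  # constant within one run
        key = (c_val, grp)
        if key not in groups:
            groups[key] = [date, date]   # first row of the run: start and end
        else:
            groups[key][1] = date        # later row of the run: extend the end
    return [(start, end, c_val) for (c_val, _), (start, end) in groups.items()]

for start, end, c_val in island_ranges(rows):
    print(start, end, c_val)
```

For the sample data grp comes out as 0, 0, 2, 2, 2, 3, 3. Note that grp alone is not guaranteed to be unique across different values, which is why the query groups by both c_val and grp.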
  1. Use the lag window function on c_val to compare each row with the previous row and flag the rows where the value changes (column change).

  2. Feed that flag into sum used as a window function; the running total gives every run of unchanged values its own number (column gr).

  3. With each run of unchanged values numbered, group by gr and the value itself, and take the minimum and maximum date of each group.

Below is the query:

select 
  min(date) as start_date, max(date) as end_date, c_val 
from (
  select 
    c_val, sum(change) over (order by date) as gr, date
  from (
    select
      c_val,
      case when lag(c_val) over (order by date) <> c_val then 1 else 0 end as change,
      date
    from t_ord
    ) seq_change
  ) groups_of_values
group by c_val,gr
order by start_date;
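The intermediate columns of this query can be traced with a short Python sketch over the sample rows from the question. This is an illustration under stated assumptions rather than the answer's own code: the function name change_groups is hypothetical, dates are plain strings, the rows are assumed to be sorted by date, and c_val is assumed to be non-null.

```python
# Sample rows from the question as (date, c_val), assumed sorted by date.
rows = [
    ("1/1/2016", 200),
    ("2/1/2016", 200),
    ("3/1/2016", 300),
    ("4/1/2016", 300),
    ("5/1/2016", 300),
    ("6/1/2016", 200),
    ("7/1/2016", 200),
]

def change_groups(rows):
    """Return (date, c_val, change, gr) for each row, in date order."""
    result = []
    prev = None  # plays the role of lag(c_val); None on the first row, like SQL NULL
    gr = 0       # running total, like sum(change) over (order by date)
    for date, c_val in rows:
        # The first row has no previous value, so the case expression falls to else 0.
        change = 1 if prev is not None and prev != c_val else 0
        gr += change
        result.append((date, c_val, change, gr))
        prev = c_val
    return result

for row in change_groups(rows):
    print(row)
```

The change column comes out as 0, 0, 1, 0, 0, 1, 0 and gr as 0, 0, 1, 1, 1, 2, 2, so grouping by c_val and gr yields the three expected date ranges.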


 