Group records with multiple sets in SQL

I need to optimize a select query so that it returns fewer records when consecutive rows hold the same data but different dates. The rows should be sorted by date, and consecutive rows should be combined into a single row until a different column value is found. Typically the data looks like this.

date        c_val
1/1/2016    200
2/1/2016    200
3/1/2016    300
4/1/2016    300
5/1/2016    300
6/1/2016    200
7/1/2016    200

Then my output should be as follows.

start_date  end_date    c_val
1/1/2016    2/1/2016    200
3/1/2016    5/1/2016    300
6/1/2016    7/1/2016    200

The query I am using for now is this:

select min(date) as start_date, max(date) as end_date, c_val
from t_ord
group by c_val;

But this returns only two records, because it groups by c_val alone. I think I need an additional OVER clause to order the rows and start a new group whenever a new value is found. Is there a feature for this in Postgres?

You can use a difference-of-row-numbers approach to classify consecutive rows (ordered by date) with the same c_val into one group, starting a new group whenever a new value is encountered. Once that is done, take the min and max date of each group per c_val.

select min(date) as startdate,max(date) as enddate,c_val
from (select c_val,date,row_number() over(order by date)
                        -row_number() over(partition by c_val order by date) as grp
      from t_ord
     ) t
group by c_val,grp;
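
To see why the difference of the two row numbers stays constant within a run, it can help to look at the intermediate columns before grouping. The following is only a sketch: the inline `values` list stands in for t_ord, the ISO dates are an assumed reading of the sample's d/m/yyyy values (only their order matters), and the column names rn_all and rn_val are made up for illustration.

```sql
-- Stand-in for t_ord built from the question's sample rows.
-- The ISO dates are an assumption; only their ordering matters.
with t_ord(date, c_val) as (
  values (date '2016-01-01', 200),
         (date '2016-01-02', 200),
         (date '2016-01-03', 300),
         (date '2016-01-04', 300),
         (date '2016-01-05', 300),
         (date '2016-01-06', 200),
         (date '2016-01-07', 200)
)
select date,
       c_val,
       -- position of the row among all rows
       row_number() over (order by date) as rn_all,
       -- position of the row among rows with the same c_val
       row_number() over (partition by c_val order by date) as rn_val,
       -- constant inside a run of equal c_val, changes when the run is interrupted
       row_number() over (order by date)
         - row_number() over (partition by c_val order by date) as grp
from t_ord
order by date;
```

For the sample rows, grp comes out as 0, 0, 2, 2, 2, 3, 3. The same grp number could in principle appear under two different c_val values, which is why the outer query groups by both c_val and grp.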

  1. You can use the lag window function on your value to detect where it differs from the previous row (column change).

  2. Then feed that into sum, used as a window function, to build a running total that numbers the groups of values (column gr).

  3. Once you have the groups of consecutive unchanged values, you can group by the group number and the value itself and get the minimum and maximum date for each group.

Below is the query:

select 
  min(date) as start_date, max(date) as end_date, c_val 
from (
  select 
    c_val, sum(change) over (order by date) as gr, date
  from (
    select
      c_val,
      case when lag(c_val) over (order by date) <> c_val then 1 else 0 end as change,
      date
    from t_ord
    ) seq_change
  ) groups_of_values
group by c_val,gr
order by start_date;
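
The intermediate columns of this approach can be inspected in the same way. This is a sketch under the same assumptions as the question's sample: the inline `values` list stands in for t_ord, and the ISO dates are an assumed reading of the original values.

```sql
-- Stand-in for t_ord built from the question's sample rows.
-- The ISO dates are an assumption; only their ordering matters.
with t_ord(date, c_val) as (
  values (date '2016-01-01', 200),
         (date '2016-01-02', 200),
         (date '2016-01-03', 300),
         (date '2016-01-04', 300),
         (date '2016-01-05', 300),
         (date '2016-01-06', 200),
         (date '2016-01-07', 200)
),
seq_change as (
  select date,
         c_val,
         -- 1 where the value differs from the previous row, else 0;
         -- the first row has no previous row, so lag is null and the case yields 0
         case when lag(c_val) over (order by date) <> c_val then 1 else 0 end as change
  from t_ord
)
select date,
       c_val,
       change,
       -- running total of change flags: increases by one at each new run
       sum(change) over (order by date) as gr
from seq_change
order by date;
```

For the sample rows, change is 0, 0, 1, 0, 0, 1, 0 and gr is 0, 0, 1, 1, 1, 2, 2, so each run of equal values gets its own group number. The running total relies on the dates being distinct; with duplicate dates the default window frame would treat tied rows as peers.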
