SQL group by a column but segmented based on another column

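I have a table with a little over 100,000 rows and 3 columns:

  • account number
  • report date
  • outstanding amount

I need to produce a report that groups the outstanding amount by account, but also splits the groups by date. Sample data for one account: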

+----------------+-------------+--------------------+
| account_number | report_date | outstanding_amount |
+----------------+-------------+--------------------+
|              1 | 02/01/2019  |                100 |
|              1 | 03/01/2019  |                100 |
|              1 | 06/01/2019  |                200 |
|              1 | 07/01/2019  |                300 |
|              1 | 10/01/2019  |                200 |
|              1 | 11/01/2019  |                200 |
|              1 | 12/01/2019  |                100 |
+----------------+-------------+--------------------+

So if I run this statement:

select * from (select account_number, min(report_date) mindate, max(report_date) maxdate, outstanding_amount from table1 group by account_number, outstanding_amount)

the result looks something like this:

+----------------+------------+------------+--------------------+
| account_number |  mindate   |  maxdate   | outstanding_amount |
+----------------+------------+------------+--------------------+
|              1 | 02/01/2019 | 12/01/2019 |                100 |
|              1 | 06/01/2019 | 11/01/2019 |                200 |
|              1 | 07/01/2019 | 07/01/2019 |                300 |
+----------------+------------+------------+--------------------+

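What I want instead is to split the result so that the date range between mindate and maxdate of one row does not overlap with the range of any other row. The result I am looking for is this: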

+----------------+------------+------------+--------------------+
| account_number |  mindate   |  maxdate   | outstanding_amount |
+----------------+------------+------------+--------------------+
|              1 | 02/01/2019 | 03/01/2019 |                100 |
|              1 | 06/01/2019 | 06/01/2019 |                200 |
|              1 | 07/01/2019 | 07/01/2019 |                300 |
|              1 | 10/01/2019 | 11/01/2019 |                200 |
|              1 | 12/01/2019 | 12/01/2019 |                100 |
+----------------+------------+------------+--------------------+

Is it possible to construct such a statement?

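To flatten the data, condense it by a calculated rank: flag each row whose outstanding_amount differs from that of the previous row for the same account, then take a running sum of the flags, so that every run of consecutive equal amounts gets its own rank.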

select account_number
, min(report_date) as mindate
, max(report_date) as maxdate
, outstanding_amount
from
(
    select q1.*
    , sum(flag) over (partition by account_number order by report_date) as rnk
    from
    (
        select t.*
        , case when outstanding_amount = lag(outstanding_amount, 1) over (partition by account_number order by report_date) then 0 else 1 end as flag
        from table1 t
    ) q1
) q2
group by account_number, outstanding_amount, rnk
order by account_number, mindate;

A test on db<>fiddle is here
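To make the flag and running-sum steps easier to follow, here is a minimal sketch that reproduces them outside the database in plain Python. It is an illustration only, not part of the original answer: the function name islands_by_flag and the helper parse are hypothetical, and the DD/MM/YYYY reading of the sample dates is an assumption.

```python
from datetime import datetime

def parse(d):
    # Assumption: the sample dates in the question are DD/MM/YYYY.
    return datetime.strptime(d, "%d/%m/%Y").date()

# Sample rows from the question: (account_number, report_date, outstanding_amount)
rows = [
    (1, "02/01/2019", 100),
    (1, "03/01/2019", 100),
    (1, "06/01/2019", 200),
    (1, "07/01/2019", 300),
    (1, "10/01/2019", 200),
    (1, "11/01/2019", 200),
    (1, "12/01/2019", 100),
]

def islands_by_flag(rows):
    """Mimic lag() + running sum(flag) + group by (account, amount, rnk)."""
    ordered = sorted(rows, key=lambda r: (r[0], parse(r[1])))
    prev = {}     # account -> amount of the previous row (the role of lag())
    rnk = {}      # account -> running sum of flag
    islands = {}  # (account, amount, rnk) -> [mindate, maxdate]
    for account, date, amount in ordered:
        # flag is 0 when the amount repeats the previous row, otherwise 1
        flag = 0 if account in prev and prev[account] == amount else 1
        rnk[account] = rnk.get(account, 0) + flag
        prev[account] = amount
        key = (account, amount, rnk[account])
        if key in islands:
            islands[key][1] = date       # extend maxdate of the current island
        else:
            islands[key] = [date, date]  # first row of a new island
    # Dicts keep insertion order, so islands come out by account, then mindate.
    return [(acct, lo, hi, amt) for (acct, amt, _), (lo, hi) in islands.items()]

for row in islands_by_flag(rows):
    print(row)
```

For the sample account, the running sum takes the values 1, 1, 2, 3, 4, 4, 5, which yields the five non-overlapping ranges requested in the question.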

This is a gaps-and-islands problem. In this case, the simplest solution is probably the difference of row numbers:

select account_number, outstanding_amount,
       min(report_date) as mindate, max(report_date) as maxdate
from (select t.*,
             row_number() over (partition by account_number order by report_date) as seqnum,
             row_number() over (partition by account_number, outstanding_amount order by report_date) as seqnum_o
      from table1 t
     ) t
group by account_number, outstanding_amount, (seqnum - seqnum_o)
order by account_number, min(report_date);

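Why this works is a little tricky to explain. But if you look at the results of the subquery, you will see how the difference of the two row numbers stays constant across adjacent rows that have the same amount.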
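One way to look at the subquery results without a database is to recompute the two row numbers by hand. The sketch below does that in plain Python for the sample account. It is illustrative only: the function name row_number_diff and the helper parse are hypothetical, and the dates are assumed to be DD/MM/YYYY.

```python
from collections import defaultdict
from datetime import datetime

def parse(d):
    # Assumption: the sample dates in the question are DD/MM/YYYY.
    return datetime.strptime(d, "%d/%m/%Y").date()

# Sample rows from the question: (account_number, report_date, outstanding_amount)
rows = [
    (1, "02/01/2019", 100),
    (1, "03/01/2019", 100),
    (1, "06/01/2019", 200),
    (1, "07/01/2019", 300),
    (1, "10/01/2019", 200),
    (1, "11/01/2019", 200),
    (1, "12/01/2019", 100),
]

def row_number_diff(rows):
    """Return (report_date, amount, seqnum, seqnum_o, seqnum - seqnum_o) per row."""
    ordered = sorted(rows, key=lambda r: (r[0], parse(r[1])))
    seq = defaultdict(int)    # row_number() partitioned by account
    seq_o = defaultdict(int)  # row_number() partitioned by (account, amount)
    out = []
    for account, date, amount in ordered:
        seq[account] += 1
        seq_o[(account, amount)] += 1
        seqnum, seqnum_o = seq[account], seq_o[(account, amount)]
        out.append((date, amount, seqnum, seqnum_o, seqnum - seqnum_o))
    return out

for row in row_number_diff(rows):
    print(row)
```

On the sample data the differences come out as 0, 0, 2, 3, 3, 3, 4. Within a run of equal amounts both counters advance by one per row, so the difference does not change. Note that the value 3 is shared by the 300 row and the following run of 200 rows, which is why outstanding_amount has to stay in the group by alongside the difference.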
