SQL: Grouping records for pathing

My SQL is a little rusty, hence the question, and it also means I am unsure how to frame the question exactly. I hope the example below explains it better.

I have a list of records, by user ID:

ID | Channel  | FiscalPeriod | Amount
======================================
A  | Online   | 201710       | 20.00
A  | Online   | 201709       | 20.00
A  | Voucher  | 201708       | 20.00
A  | Voucher  | 201707       | 20.00
A  | Voucher  | 201706       | 20.00
A  | Online   | 201705       | 20.00
A  | Online   | 201704       | 20.00

I need to group on the "channel" field, but not as one complete group per channel, because the ordering by fiscal period is important. The result set would look like the following:

ID | Channel  | MAX(FiscalPeriod) | Amount
==========================================
A  | Online   | 201710            | 40.00
A  | Voucher  | 201708            | 60.00
A  | Online   | 201705            | 40.00

For pathing, the output ordering above is really important, so that I can see that online << voucher << online was the user path.

You can use a difference-of-row-numbers approach to classify consecutive rows with the same channel into a group (run the inner query to check this), and then use that group to get the sum and max.

select distinct id,channel
,max(fiscalperiod) over(partition by id,channel,grp) as max_fiscalperiod
,sum(amount) over(partition by id,channel,grp) as amount
from (select t.*,
      row_number() over(partition by id order by fiscalperiod)
      -row_number() over(partition by id,channel order by fiscalperiod) as grp
      from tbl t
     ) t

I would approach this using the difference of row numbers together with aggregation:

select id, channel,
       max(fiscalperiod) as max_fiscalperiod,
       sum(amount) as amount
from (select t.*,
             row_number() over (partition by id order by fiscalperiod) as seqnum_i,
             row_number() over (partition by id, channel order by fiscalperiod) as seqnum_ic
      from t
     ) t
group by id, channel, (seqnum_i - seqnum_ic);

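Understanding why the difference of row numbers works is a little cumbersome. Within a run of consecutive rows for the same channel, both row numbers increase by one per row, so their difference stays constant; once another channel intervenes, the per-id number keeps advancing while the per-channel number does not, so the next run of that channel gets a different difference. If you stare at the results from the subquery, you'll see why the difference defines the groups that you want.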
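For anyone who wants to look at those subquery results without setting up a database server, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. It assumes the bundled SQLite is 3.25 or newer (needed for window functions). The column types and the final `order by` are my own additions for illustration, not part of the answers above.

```python
import sqlite3

# Assumption: the bundled SQLite is 3.25+ (window function support).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id text, channel text, fiscalperiod integer, amount real)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("A", "Online", 201710, 20.0),
        ("A", "Online", 201709, 20.0),
        ("A", "Voucher", 201708, 20.0),
        ("A", "Voucher", 201707, 20.0),
        ("A", "Voucher", 201706, 20.0),
        ("A", "Online", 201705, 20.0),
        ("A", "Online", 201704, 20.0),
    ],
)

# The subquery from the answer: one row number per id, one per (id, channel).
SUBQUERY = """
    select t.*,
           row_number() over (partition by id order by fiscalperiod) as seqnum_i,
           row_number() over (partition by id, channel order by fiscalperiod) as seqnum_ic
    from t
"""

# Intermediate rows: grp stays constant within each run of the same channel.
inner = conn.execute(
    f"""
    select fiscalperiod, channel, seqnum_i, seqnum_ic,
           seqnum_i - seqnum_ic as grp
    from ({SUBQUERY}) s
    order by fiscalperiod
    """
).fetchall()
for row in inner:
    print(row)  # grp per row, oldest first: 0, 0, 2, 2, 2, 3, 3

# Aggregate per run. The order by (an addition here) lists the newest run first.
result = conn.execute(
    f"""
    select id, channel,
           max(fiscalperiod) as max_fiscalperiod,
           sum(amount) as amount
    from ({SUBQUERY}) s
    group by id, channel, (seqnum_i - seqnum_ic)
    order by id, max_fiscalperiod desc
    """
).fetchall()
for row in result:
    print(row)
```

With the sample data, the two Online runs end up with differences 0 and 3, so they stay separate even though they share a channel, and the aggregated rows come back as Online 201710 / 40, Voucher 201708 / 60, Online 201705 / 40. Without an explicit `order by`, the database is free to return the groups in any order, which matters if the path order is the point.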
