
How to write a query to attach rownumber (1 to n) to each record for each group

I have a dataset something like the one below:

|date|flag|
|20190503|0|
|20190504|1|
|20190505|1|
|20190506|1|
|20190507|1|
|20190508|0|
|20190509|0|
|20190510|0|
|20190511|1|
|20190512|1|
|20190513|0|
|20190514|0|
|20190515|1|

What I want to achieve is to group the consecutive dates where flag = 1 and add a counter column: mark 1 for the first day of each run of consecutive flag = 1 days, 2 for the second day, and so on, and assign 0 wherever flag = 0.

|date|flag|counter|
|20190503|0|0|
|20190504|1|1|
|20190505|1|2|
|20190506|1|3|
|20190507|1|4|
|20190508|0|0|
|20190509|0|0|
|20190510|0|0|
|20190511|1|1|
|20190512|1|2|
|20190513|0|0|
|20190514|0|0|
|20190515|1|1|

I tried analytic functions and a hierarchical query, but still haven't found a solution. Seeking help - any hint is appreciated!

Thanks, Hong
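
For anyone who wants to reproduce the examples, here is a minimal setup sketch. The table name t and the columns date and flag are taken from the answers below; the data types and the multi-row insert syntax are assumptions and may need adjusting per database (for example, date may need quoting where it is a reserved word).

create table t (date date, flag int);

insert into t (date, flag) values
  (date '2019-05-03', 0),
  (date '2019-05-04', 1),
  (date '2019-05-05', 1),
  (date '2019-05-06', 1),
  (date '2019-05-07', 1),
  (date '2019-05-08', 0),
  (date '2019-05-09', 0),
  (date '2019-05-10', 0),
  (date '2019-05-11', 1),
  (date '2019-05-12', 1),
  (date '2019-05-13', 0),
  (date '2019-05-14', 0),
  (date '2019-05-15', 1);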

You can define the groups using a cumulative sum of the zeros. Then use row_number():

select t.*,
       -- flag = 0 rows get 0; flag = 1 rows are numbered within their group
       -- (partitioning by grp and flag keeps the group's leading flag = 0 row
       --  from shifting the numbering)
       (case when flag = 0 then 0
             else row_number() over (partition by grp, flag order by date)
        end) as counter
from (select t.*,
             -- grp grows by 1 on every flag = 0 row, so each run of flag = 1
             -- rows shares one grp value
             sum(case when flag = 0 then 1 else 0 end) over (order by date) as grp
      from t
     ) t;
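
For the sample data, the inner query's running grp value looks like this: each flag = 0 row starts a new group, the flag = 1 rows that follow share its grp, and row_number() over the flag = 1 rows of a group then yields the 1..n counter.

|date|flag|grp|
|20190503|0|1|
|20190504|1|1|
|20190505|1|1|
|20190506|1|1|
|20190507|1|1|
|20190508|0|2|
|20190509|0|3|
|20190510|0|4|
|20190511|1|4|
|20190512|1|4|
|20190513|0|5|
|20190514|0|6|
|20190515|1|6|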

A very different approach is to take the difference between the current date and a cumulative max of the flag = 0 dates:

select t.*,
       -- days since the most recent flag = 0 date (a cumulative max); on a
       -- flag = 0 row that max is the row's own date, so the counter resets to 0
       datediff(day,
                max(case when flag = 0 then date end) over (order by date),
                date
               ) as counter
from t;
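
Traced against the sample data (this assumes date is stored as an actual date type; note also that the exact datediff syntax varies by database), the running max of the flag = 0 dates and the resulting counter are:

|date|flag|last flag=0 date|counter|
|20190503|0|20190503|0|
|20190504|1|20190503|1|
|20190505|1|20190503|2|
|20190506|1|20190503|3|
|20190507|1|20190503|4|
|20190508|0|20190508|0|
|20190509|0|20190509|0|
|20190510|0|20190510|0|
|20190511|1|20190510|1|
|20190512|1|20190510|2|
|20190513|0|20190513|0|
|20190514|0|20190514|0|
|20190515|1|20190514|1|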

Note that the logic of these two approaches is different -- although they should produce the same results for the data you have provided. The first approach simply ignores missing dates; the second will increment the counter across missing dates.

Well - Vertica has a very nice CONDITIONAL_CHANGE_EVENT() function that could help you there ...

Every time the expression between the brackets changes, an integer is incremented by 1. This gives you a new group identifier, or a criterion to PARTITION BY, every time the flag changes. So: one SELECT to get the grouping info, then partition by the obtained grouping info. Here goes:

WITH
input(dt,flag) AS (
          SELECT '2019-05-03'::DATE,0
UNION ALL SELECT '2019-05-04'::DATE,1
UNION ALL SELECT '2019-05-05'::DATE,1
UNION ALL SELECT '2019-05-06'::DATE,1
UNION ALL SELECT '2019-05-07'::DATE,1
UNION ALL SELECT '2019-05-08'::DATE,0
UNION ALL SELECT '2019-05-09'::DATE,0
UNION ALL SELECT '2019-05-10'::DATE,0
UNION ALL SELECT '2019-05-11'::DATE,1
UNION ALL SELECT '2019-05-12'::DATE,1
UNION ALL SELECT '2019-05-13'::DATE,0
UNION ALL SELECT '2019-05-14'::DATE,0
UNION ALL SELECT '2019-05-15'::DATE,1
)
,
grp_input AS (
SELECT
*
, CONDITIONAL_CHANGE_EVENT(flag) OVER(ORDER BY dt) AS grp
FROM input
)
SELECT
dt
, flag
, CASE FLAG
WHEN 0 THEN 0
ELSE ROW_NUMBER() OVER(PARTITION BY grp ORDER BY dt)
END AS counter
FROM grp_input;
-- out      dt     | flag | counter 
-- out ------------+------+---------
-- out  2019-05-03 |    0 |       0
-- out  2019-05-04 |    1 |       1
-- out  2019-05-05 |    1 |       2
-- out  2019-05-06 |    1 |       3
-- out  2019-05-07 |    1 |       4
-- out  2019-05-08 |    0 |       0
-- out  2019-05-09 |    0 |       0
-- out  2019-05-10 |    0 |       0
-- out  2019-05-11 |    1 |       1
-- out  2019-05-12 |    1 |       2
-- out  2019-05-13 |    0 |       0
-- out  2019-05-14 |    0 |       0
-- out  2019-05-15 |    1 |       1
-- out (13 rows)
-- out 
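For databases that do not have CONDITIONAL_CHANGE_EVENT(), the grp_input step can be rebuilt portably with LAG() plus a running sum of the change points. The sketch below is my own addition, not part of the original answer; it keeps the input CTE and the final SELECT unchanged, and the CTE name flag_changes is made up for illustration.

,
flag_changes AS (
SELECT
*
-- 1 on every row whose flag differs from the previous row's flag, else 0
, CASE WHEN flag <> LAG(flag, 1, flag) OVER(ORDER BY dt) THEN 1 ELSE 0 END AS changed
FROM input
)
,
grp_input AS (
SELECT
dt
, flag
-- the running sum of the change points plays the role of CONDITIONAL_CHANGE_EVENT()
, SUM(changed) OVER(ORDER BY dt) AS grp
FROM flag_changes
)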
