
Sequence Numbering in SQL

I have a list of activities (A = start through H = end) for certain events. These can occur in any order, any number of times, and can restart as well. I need to identify the blocks of activities within an event.

E.g.: A BCDEFG H BCD H CDEF H EFG H

The event starts only once (A) but ends multiple times (H).

I need to number these activities to identify the sets (that is, how many times the event ended).

output: 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 4 4 4 4 5

This helps me identify that the event ended (H) 5 - 1 = 4 times.

It looks like you want to count the number of "H"s and "A"s before a given value. This requires a column that specifies the ordering. Let me assume this column is called id.

Then you can do this with window functions:

select t.*,
       sum(case when col = 'H' then 1 else 0 end) over (partition by grp order by id) + 1 as output
from (select t.*,
             sum(case when col = 'A' then 1 else 0 end) over (order by id) as grp
      from t
     ) t;

The subquery assigns each row to a group by taking a cumulative sum of "A"s, so every "A" starts a new group. The outer query then numbers the blocks within each group by taking a cumulative sum of "H"s.
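As a quick check, the query can be run against the sample sequence from the question. This is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table name t and the columns id and col follow the assumption above; the in-memory database and the way the rows are loaded are only for illustration.

```python
import sqlite3

# Sample sequence from the question: A BCDEFG H BCD H CDEF H EFG H
letters = "ABCDEFGHBCDHCDEFHEFGH"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col TEXT)")
# id is the assumed ordering column: 1, 2, 3, ... in input order
conn.executemany(
    "INSERT INTO t (id, col) VALUES (?, ?)",
    list(enumerate(letters, start=1)),
)

query = """
select t.id, t.col,
       sum(case when col = 'H' then 1 else 0 end)
           over (partition by grp order by id) + 1 as output
from (select t.*,
             sum(case when col = 'A' then 1 else 0 end)
                 over (order by id) as grp
      from t
     ) t
order by t.id
"""

numbering = [row[2] for row in conn.execute(query)]
print(" ".join(str(n) for n in numbering))
# prints: 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 4 4 4 4 5
```

The printed numbering matches the output requested in the question: each "H" is counted on its own row, so it takes the same number as the rows that follow it.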

To be honest, I cannot tell whether the "H" belongs to the preceding block or the next one. The query above numbers each "H" with the block that follows it. If it should instead belong to the preceding block, the query can use a window frame clause or a slight tweak to the logic:

       (sum(case when col = 'H' then 1 else 0 end) over (partition by grp order by id) +
        (case when col = 'H' then 0 else 1 end)
       ) as output
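To see what the tweak does, here is a minimal sketch that plugs the expression into the full query and runs it on the question's sample sequence. It again assumes a table t with columns id and col, loaded into an in-memory SQLite database (3.25 or later) purely for illustration.

```python
import sqlite3

letters = "ABCDEFGHBCDHCDEFHEFGH"  # sample sequence from the question

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany(
    "INSERT INTO t (id, col) VALUES (?, ?)",
    list(enumerate(letters, start=1)),
)

query = """
select t.id, t.col,
       (sum(case when col = 'H' then 1 else 0 end)
            over (partition by grp order by id) +
        (case when col = 'H' then 0 else 1 end)
       ) as output
from (select t.*,
             sum(case when col = 'A' then 1 else 0 end)
                 over (order by id) as grp
      from t
     ) t
order by t.id
"""

numbering = [row[2] for row in conn.execute(query)]
print(" ".join(str(n) for n in numbering))
# prints: 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 4 4 4 4
```

With this expression every "H" receives the same number as the rows before it, so the highest number equals the number of times the event ended.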

If your events form a series in time, try playing with the MATCH() clause and its dependent functions event_name(), pattern_id() and match_id().

I just created a time series out of your input letters, spaced at one-hour intervals, and applied a MATCH() clause. If the PATTERN pat AS () clause uncannily reminds you of a grep expression, that is exactly how it works.

Just look at the query's output, and imagine how many interesting things you could do with the pattern_id and match_id values you get: grouping by them in subsequent SELECTs, for example.

WITH 
s(tm,event) AS (
          SELECT TIME '00:00:00','A'
UNION ALL SELECT TIME '01:00:00','B'
UNION ALL SELECT TIME '02:00:00','C'
UNION ALL SELECT TIME '03:00:00','D'
UNION ALL SELECT TIME '04:00:00','E'
UNION ALL SELECT TIME '05:00:00','F'
UNION ALL SELECT TIME '06:00:00','G'
UNION ALL SELECT TIME '07:00:00','H'
UNION ALL SELECT TIME '08:00:00','B'
UNION ALL SELECT TIME '09:00:00','C'
UNION ALL SELECT TIME '10:00:00','D'
UNION ALL SELECT TIME '11:00:00','H'
UNION ALL SELECT TIME '12:00:00','C'
UNION ALL SELECT TIME '13:00:00','D'
UNION ALL SELECT TIME '14:00:00','E'
UNION ALL SELECT TIME '15:00:00','F'
UNION ALL SELECT TIME '16:00:00','H'
UNION ALL SELECT TIME '17:00:00','E'
UNION ALL SELECT TIME '18:00:00','F'
UNION ALL SELECT TIME '19:00:00','G'
UNION ALL SELECT TIME '20:00:00','H'
)
SELECT
  *
, event_name()
, pattern_id()
, match_id()
FROM s
MATCH(
  PARTITION BY 1 -- nothing to partition by
  ORDER BY tm
  DEFINE  
    START_ev AS (event='A')
  , any_ev   AS (event NOT IN ('A','H'))
  , END_ev   AS (event='H')
  PATTERN pat AS (start_ev* any_ev+ end_ev)
  ROWS MATCH FIRST EVENT
);

tm      |event|event_name|pattern_id|match_id
00:00:00|A    |START_ev  |         1|       1
01:00:00|B    |any_ev    |         1|       2
02:00:00|C    |any_ev    |         1|       3
03:00:00|D    |any_ev    |         1|       4
04:00:00|E    |any_ev    |         1|       5
05:00:00|F    |any_ev    |         1|       6
06:00:00|G    |any_ev    |         1|       7
07:00:00|H    |END_ev    |         1|       8
08:00:00|B    |any_ev    |         2|       1
09:00:00|C    |any_ev    |         2|       2
10:00:00|D    |any_ev    |         2|       3
11:00:00|H    |END_ev    |         2|       4
12:00:00|C    |any_ev    |         3|       1
13:00:00|D    |any_ev    |         3|       2
14:00:00|E    |any_ev    |         3|       3
15:00:00|F    |any_ev    |         3|       4
16:00:00|H    |END_ev    |         3|       5
17:00:00|E    |any_ev    |         4|       1
18:00:00|F    |any_ev    |         4|       2
19:00:00|G    |any_ev    |         4|       3
20:00:00|H    |END_ev    |         4|       4
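For engines without a MATCH() clause, the same pattern_id values can be approximated for this particular input with a plain window function and then grouped. The sketch below is an assumption-laden illustration, not a reimplementation of MATCH(): it does no pattern matching and simply closes a block at every "H". It uses Python's sqlite3 module (SQLite 3.25 or later), stores tm as zero-padded text, and the CTE name numbered and the output column names are made up for the example.

```python
import sqlite3

events = "ABCDEFGHBCDHCDEFHEFGH"
# One event per hour, starting at 00:00:00, as in the example above
rows = [(f"{hour:02d}:00:00", ev) for hour, ev in enumerate(events)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (tm TEXT, event TEXT)")
conn.executemany("INSERT INTO s VALUES (?, ?)", rows)

query = """
WITH numbered AS (
  SELECT tm, event,
         -- running count of 'H' rows; an 'H' stays in the block it closes
         SUM(CASE WHEN event = 'H' THEN 1 ELSE 0 END) OVER (ORDER BY tm)
           + (CASE WHEN event = 'H' THEN 0 ELSE 1 END) AS pattern_id
  FROM s
)
SELECT pattern_id,
       COUNT(*) AS n_events,
       MIN(tm)  AS first_tm,
       MAX(tm)  AS last_tm
FROM numbered
GROUP BY pattern_id
ORDER BY pattern_id
"""

blocks = conn.execute(query).fetchall()
for block in blocks:
    print(block)
```

Each result row summarizes one block: its pattern_id, how many events it contains, and when it started and ended. The n_events value corresponds to the highest match_id within that pattern in the output above.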
