How to find next row in ordered table that matches a condition, given an initial match in SQL

I'm querying a table that contains state transitions for a state engine. Each row records the prev_state, the current_state, and the timestamp of a transition, and rows are grouped by a unique id.

My goal is to find a sequence of target intervals. Each interval is defined by the timestamp of an initial state transition (e.g. the timestamp when we shift from 1->2) and the timestamp of the next state transition that matches a specific condition (e.g. the next timestamp at which current_state=3 OR current_state=4).

state_transition_table
+------------+---------------+-----------+----+
| prev_state | current_state | timestamp | id |
+------------+---------------+-----------+----+
|          1 |             2 |       4.5 |  1 |
|          2 |             3 |       5.2 |  1 |
|          3 |             1 |       5.4 |  1 |
|          1 |             2 |      10.3 |  1 |
|          2 |             5 |      10.4 |  1 |
|          5 |             4 |      10.8 |  1 |
|          4 |             1 |      11.0 |  1 |
|          1 |             2 |      12.3 |  1 |
|          2 |             3 |      13.5 |  1 |
|          3 |             1 |      13.6 |  1 |
+------------+---------------+-----------+----+

Within a given id, we want to find all intervals that start with 1->2 (an easy enough query) and end with either state 3 or 4, that is: 1->2->anything->3 or 4.

An example output table for the input above would have the three states and the timestamps of the transitions between them:

target output
+------------+---------------+------------+-----------+-----------+
| prev_state | current_state | end_state  | curr_time | end_time  |
+------------+---------------+------------+-----------+-----------+
|          1 |             2 |          3 |       4.5 |       5.2 |
|          1 |             2 |          4 |      10.3 |      10.8 |
|          1 |             2 |          3 |      12.3 |      13.5 |
+------------+---------------+------------+-----------+-----------+

The best query I could come up with uses window functions in a CTE and then builds the new columns from it. But this solution only looks at the single row immediately following the initial transition, so it doesn't allow other states to occur between the initial transition and the arrival of the target state.

WITH state_transitions AS (
SELECT
  id,
  prev_state,
  current_state,
  -- LEAD only sees the single next row per id, not the next row matching a condition
  LEAD(current_state) OVER (PARTITION BY id ORDER BY timestamp) AS end_state,
  timestamp AS curr_time,
  LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp) AS end_time
FROM
  state_transition_table
)

SELECT
  prev_state,
  current_state,
  end_state,
  curr_time,
  end_time
FROM state_transitions
WHERE prev_state = 1 AND current_state = 2
ORDER BY curr_time

This query would incorrectly give the second output row end_state==5, which is not what I am looking for.

How can I search the table for the next row that matches my target condition, e.g. end_state=3 OR end_state=4?

This requires a recursive query that checks each row against its sibling (the next row in time order). The query should handle paths of more than three rows. I assumed Oracle for the seed data, so you may need to adapt the syntax to your database engine. I tried to document the query as well as I thought was needed.

WITH /*SEED DATA*/
  state_transition_table(prev_state, current_state, time_stamp, id) as (
              SELECT          1 ,             2 ,       4.5 ,  1 --FROM DUAL
    UNION ALL SELECT          2 ,             3 ,       5.2 ,  1 --FROM DUAL
    UNION ALL SELECT          3 ,             1 ,       5.4 ,  1 --FROM DUAL
    UNION ALL SELECT          1 ,             2 ,      10.3 ,  1 --FROM DUAL
    UNION ALL SELECT          2 ,             5 ,      10.4 ,  1 --FROM DUAL
    UNION ALL SELECT          5 ,             4 ,      10.8 ,  1 --FROM DUAL
    UNION ALL SELECT          4 ,             1 ,      11.0 ,  1 --FROM DUAL
    UNION ALL SELECT          1 ,             2 ,      12.3 ,  1 --FROM DUAL
    UNION ALL SELECT          2 ,             3 ,      13.5 ,  1 --FROM DUAL
    UNION ALL SELECT          3 ,             1 ,      13.6 ,  1 --FROM DUAL
)

/*THE END STATES YOU ARE LOOKING FOR*/
, end_states (a_state) as (
              select 3 --FROM DUAL
    union all select 4 --FROM DUAL
)

/*ORDER THE STEPS WITHIN EACH id SO THE order_id COLUMN CAN BE USED TO EVALUATE THE NEXT NODE*/
, ordered_states as (
    SELECT row_number() OVER (PARTITION BY id ORDER BY time_stamp)  order_id
         , prev_state
         , current_state
         , id
         , time_stamp
    FROM   state_transition_table
)

/*RECURSIVE QUERY WITH ANSI SYNTAX*/
, recursive (
           root_order_id
         , order_id
         , time_stamp
         , prev_state
         , current_state
         , id

         , steps
  )
as (
    SELECT order_id root_order_id /*THE order_id OF EACH ROOT ROW*/
         , order_id
         , time_stamp
         , prev_state
         , current_state
         , id

         , CAST(order_id as char(100)) as steps /*INITIAL VALIDATION PATH*/
    FROM   ordered_states
    WHERE  prev_state = 1 AND current_state = 2 /*INITIAL CONDITION*/

    UNION ALL
    SELECT prev.root_order_id
         , this.order_id
         , this.time_stamp
         , prev.prev_state
         , this.current_state
         , this.id

         , CAST(CONCAT(CONCAT(RTRIM(LTRIM(prev.steps)), ', '), RTRIM(LTRIM(CAST(this.order_id as char(3))))) as char(100)) as steps
    FROM   recursive prev /*ANSI PSEUDO TABLE*/
         , ordered_states this /*THE SIBLING ROW TO CHECK*/

    WHERE prev.id = this.id /*STAY WITHIN THE SAME id*/
      and prev.order_id = this.order_id - 1 /*ROW TO PREVIOUS ROW JOIN*/
      and prev.current_state not in (select a_state from end_states) /*THE PREVIOUS ROW STATE IS NOT AN END STATE */
)

select init_state.prev_state
     , init_state.current_state as mid_state /*this name is better, I think*/
     , end_state.current_state as end_state
     , init_state.time_stamp as initial_time /*initial_time is better, I think*/
     , end_state.time_stamp as end_time /*end_time is better, I think*/
     , recursive.steps as validation_path_by_order_id
from   recursive
inner join ordered_states init_state
    on init_state.id = recursive.id
   and init_state.order_id = recursive.root_order_id
inner join ordered_states end_state
    on end_state.id = recursive.id
   and end_state.order_id = recursive.order_id
where  recursive.current_state in (select a_state from end_states)
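For comparison, here is a minimal sketch of a different, non-recursive technique. It is not the recursive query above: for each 1->2 row it uses a correlated subquery to pick the earliest later row of the same id whose current_state is 3 or 4. To make it runnable end to end it goes through Python's built-in sqlite3 module. The function name find_intervals is hypothetical, the column names follow the seed data above, and the sketch assumes timestamps are unique within an id.

```python
import sqlite3

# Sample rows copied from the question: (prev_state, current_state, time_stamp, id)
ROWS = [
    (1, 2, 4.5, 1), (2, 3, 5.2, 1), (3, 1, 5.4, 1),
    (1, 2, 10.3, 1), (2, 5, 10.4, 1), (5, 4, 10.8, 1),
    (4, 1, 11.0, 1), (1, 2, 12.3, 1), (2, 3, 13.5, 1),
    (3, 1, 13.6, 1),
]

QUERY = """
SELECT s.prev_state,
       s.current_state,
       e.current_state AS end_state,
       s.time_stamp    AS curr_time,
       e.time_stamp    AS end_time
FROM   state_transition_table s
JOIN   state_transition_table e
  ON   e.id = s.id
 AND   e.time_stamp = (
           -- earliest later row of the same id that reaches an end state
           SELECT MIN(n.time_stamp)
           FROM   state_transition_table n
           WHERE  n.id = s.id
             AND  n.time_stamp > s.time_stamp
             AND  n.current_state IN (3, 4)
       )
WHERE  s.prev_state = 1 AND s.current_state = 2
ORDER BY s.id, s.time_stamp
"""


def find_intervals(rows):
    """Load rows into an in-memory SQLite table and return the intervals."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE state_transition_table ("
        "prev_state INTEGER, current_state INTEGER, time_stamp REAL, id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO state_transition_table VALUES (?, ?, ?, ?)", rows
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in find_intervals(ROWS):
        print(row)
```

On the sample data this should return the three rows of the target output table. A start row with no later end state is dropped by the inner join, which is the same outcome as with the recursive walk.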

One final note. The result columns cover only three states (prev_state, mid_state and the end state). As I said above, a path from (1) to (2) to (3 or 4) can span more than three rows, say 1 to 2 to 5 to 2 to 3, so mid_state is really just one of the states in the middle.
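The longer path mentioned in this note can be traced with a small sketch. It re-expresses the row-by-row walk as a SQLite recursive CTE run through Python's sqlite3 module, so the syntax differs from the query above. The four rows and their timestamps are made up purely for illustration, the names walk_paths, walk and s0 are hypothetical, and ROW_NUMBER needs SQLite 3.25 or newer.

```python
import sqlite3

# Made-up path 1 -> 2 -> 5 -> 2 -> 3 for a single id:
# (prev_state, current_state, time_stamp, id)
LONG_PATH = [
    (1, 2, 1.0, 1),
    (2, 5, 2.0, 1),
    (5, 2, 3.0, 1),
    (2, 3, 4.0, 1),
]

QUERY = """
WITH RECURSIVE
ordered AS (
    -- number the rows of each id in time order
    SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY time_stamp) AS order_id,
           id, prev_state, current_state, time_stamp
    FROM   state_transition_table
),
walk (id, root_order_id, order_id, current_state, time_stamp, steps) AS (
    -- anchor: every 1 -> 2 transition starts a walk
    SELECT id, order_id, order_id, current_state, time_stamp,
           CAST(order_id AS TEXT)
    FROM   ordered
    WHERE  prev_state = 1 AND current_state = 2
    UNION ALL
    -- step to the next row until an end state (3 or 4) has been reached
    SELECT nxt.id, walk.root_order_id, nxt.order_id, nxt.current_state,
           nxt.time_stamp, walk.steps || ', ' || nxt.order_id
    FROM   walk
    JOIN   ordered nxt
      ON   nxt.id = walk.id AND nxt.order_id = walk.order_id + 1
    WHERE  walk.current_state NOT IN (3, 4)
)
SELECT s0.time_stamp      AS initial_time,
       walk.current_state AS end_state,
       walk.time_stamp    AS end_time,
       walk.steps
FROM   walk
JOIN   ordered s0
  ON   s0.id = walk.id AND s0.order_id = walk.root_order_id
WHERE  walk.current_state IN (3, 4)
ORDER BY initial_time
"""


def walk_paths(rows):
    """Return (initial_time, end_state, end_time, steps) for each walk."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE state_transition_table ("
        "prev_state INTEGER, current_state INTEGER, time_stamp REAL, id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO state_transition_table VALUES (?, ?, ?, ?)", rows
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(walk_paths(LONG_PATH))
```

The single result row carries the path "1, 2, 3, 4" in steps, so two intermediate rows sit between the 1->2 start and the end state, and no single mid_state column can describe them all.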

Final-final note: your desired results table was wrong, but you corrected it. 👍 👍
