
Finding consecutive patterns (with SQL)

A table named consecutive in PostgreSQL: each se_id has idx values from 0 up to 100 (in this example, 0 to 9).

The search pattern:

SELECT *
FROM consecutive
WHERE val_3_bool = 1
AND val_1_dur > 4100 AND val_1_dur < 5900

[image: source_table]

Now I'm looking for the longest consecutive run of rows matching this pattern for each p_id, along with the AVG of val_1_dur over the counted rows.

[image: result_table]

Is it possible to calculate this in pure SQL?


One method is the difference-of-row-numbers approach, which identifies the consecutive sequences for each pid:

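select pid, count(*) as in_a_row, sum(val1_dur) as dur
from (select t.*,
             -- position of the row within its pid
             row_number() over (partition by pid order by idx) as seqnum,
             -- position of the row among rows with the same pid and val3_bool
             row_number() over (partition by pid, val3_bool order by idx) as seqnum_d
      from consecutive t
     ) t
-- the difference is constant within each run of identical val3_bool values
group by (seqnum - seqnum_d), pid, val3_bool;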

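If you are looking specifically for "1" values, then add where val3_bool = 1 to the outer query, before the group by. To understand why this works, look at the results of the subquery: within a run of consecutive rows that share the same val3_bool, both row numbers increase by one per row, so their difference stays constant and identifies the run. Two runs with different val3_bool values can share the same difference, which is why val3_bool is also part of the group by.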
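To make the subquery's output concrete, here is a minimal sketch that runs it on a handful of made-up rows. The sample data is hypothetical (it is not taken from the question), and the column names pid, idx and val3_bool follow the answer's naming; the extra grp column is only added here for illustration.

```sql
-- Hypothetical sample rows: a single pid with idx 0..5.
with consecutive (pid, idx, val3_bool) as (
    values (1, 0, 1), (1, 1, 1), (1, 2, 0),
           (1, 3, 1), (1, 4, 1), (1, 5, 1)
)
select pid, idx, val3_bool,
       row_number() over (partition by pid order by idx) as seqnum,
       row_number() over (partition by pid, val3_bool order by idx) as seqnum_d,
       -- the difference of the two row numbers labels each run
       row_number() over (partition by pid order by idx)
         - row_number() over (partition by pid, val3_bool order by idx) as grp
from consecutive
order by pid, idx;

-- pid | idx | val3_bool | seqnum | seqnum_d | grp
--   1 |   0 |         1 |      1 |        1 |   0
--   1 |   1 |         1 |      2 |        2 |   0
--   1 |   2 |         0 |      3 |        1 |   2
--   1 |   3 |         1 |      4 |        3 |   1
--   1 |   4 |         1 |      5 |        4 |   1
--   1 |   5 |         1 |      6 |        5 |   1
```

The two rows at idx 0 and 1 share grp 0, the three rows at idx 3 to 5 share grp 1, and the interrupting row at idx 2 gets its own value, so grouping by grp (together with pid and val3_bool) yields one group per run.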

You can then get the longest sequence for each pid using distinct on:

select distinct on (pid) t.*
from (select pid, count(*) as in_a_row, sum(val1_dur) as dur
      from (select t.*,
                   row_number() over (partition by pid order by idx) as seqnum,
                   row_number() over (partition by pid, val3_bool order by idx) as seqnum_d
            from consecutive t
           ) t
      group by (seqnum - seqnum_d), pid, val3_bool
     ) t
-- distinct on keeps the first row per pid, i.e. the one with the largest in_a_row
order by pid, in_a_row desc;

The distinct on does not require an additional level of subquery, but I think that makes the logic clearer.
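As a sketch of how this might be adapted to the full search pattern from the question (val_3_bool = 1 and val_1_dur between 4100 and 5900, with an average instead of a sum), the query below flags matching rows first and applies distinct on without the extra subquery level. It is an illustration under assumptions, not a tested solution for the original table: the sample rows, the is_match flag and the CTE names flagged and numbered are all hypothetical, and the columns use the answer's spelling (pid, val1_dur, val3_bool) rather than the question's.

```sql
-- Hypothetical sample rows standing in for the real table.
with consecutive (pid, idx, val1_dur, val3_bool) as (
    values (1, 0, 5000, 1), (1, 1, 4500, 1), (1, 2, 6000, 1),
           (1, 3, 4200, 1), (1, 4, 5800, 1), (1, 5, 5000, 1),
           (2, 0, 5000, 0), (2, 1, 4300, 1), (2, 2, 4700, 1)
),
flagged as (
    -- is_match is true when a row satisfies the whole search pattern
    select c.*,
           (val3_bool = 1 and val1_dur > 4100 and val1_dur < 5900) as is_match
    from consecutive c
),
numbered as (
    select f.*,
           row_number() over (partition by pid order by idx) as seqnum,
           row_number() over (partition by pid, is_match order by idx) as seqnum_d
    from flagged f
)
select distinct on (pid)
       pid, count(*) as in_a_row, avg(val1_dur) as avg_dur
from numbered
where is_match                      -- only runs of matching rows are of interest
group by pid, (seqnum - seqnum_d)   -- one group per consecutive run
order by pid, count(*) desc;        -- distinct on keeps the longest run per pid
```

With these sample rows the longest run for pid 1 has 3 rows with an average of 5000 (the row with val1_dur 6000 breaks the earlier run), and for pid 2 it has 2 rows with an average of 4500. If two runs of the same pid tie on length, distinct on picks one of them arbitrarily unless a further ordering column is added.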
