
Count consecutive recurring values

I have been struggling to find any information on this after a couple of hours of searching and trial and error. We have the following table structure:

Name EventDateTime Mark
Dave 2021-03-24 09:00:00 Present
Dave 2021-03-24 14:00:00 Absent
Dave 2021-03-25 09:00:00 Absent
Dave 2021-03-26 09:00:00 Absent
Dave 2021-03-27 09:00:00 Present
Dave 2021-03-27 14:00:00 Absent
Dave 2021-03-28 09:00:00 Absent
Dave 2021-03-29 10:00:00 Absent
Dave 2021-03-30 13:00:00 Absent
Jane 2021-03-30 13:00:00 Absent

Basically, these are attendance registers for people at events. We need to pull a report showing who we have had no contact from for more than x consecutive days. "Consecutive" refers to the days on which a person has events in the data, not consecutive calendar days. Also, if there is a Present on a day where they were also Absent, the count needs to start again from the next day they were absent.

The first issue is getting the distinct dates on which there are only absences; the second is getting the number of consecutive days of absences. I've done the second in MySQL with variables, but struggled to migrate it to PostgreSQL, where the reporting is done.
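For the first issue, one possible approach is to group by person and calendar date and keep only the groups that contain no 'Present' row. The sketch below runs such a query against an in-memory SQLite database from Python purely so that it is self-contained. The table name `events`, the `EventDate` alias and the helper name `absent_only_days` are assumptions for illustration. The SELECT uses only standard constructs that PostgreSQL should also accept, but it has only been reasoned through against the sample data above.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("Dave", "2021-03-24 09:00:00", "Present"),
    ("Dave", "2021-03-24 14:00:00", "Absent"),
    ("Dave", "2021-03-25 09:00:00", "Absent"),
    ("Dave", "2021-03-26 09:00:00", "Absent"),
    ("Dave", "2021-03-27 09:00:00", "Present"),
    ("Dave", "2021-03-27 14:00:00", "Absent"),
    ("Dave", "2021-03-28 09:00:00", "Absent"),
    ("Dave", "2021-03-29 10:00:00", "Absent"),
    ("Dave", "2021-03-30 13:00:00", "Absent"),
    ("Jane", "2021-03-30 13:00:00", "Absent"),
]

# One row per (person, day) on which that person has no 'Present' mark.
ABSENT_ONLY_DAYS_SQL = """
SELECT Name, DATE(EventDateTime) AS EventDate
FROM events
GROUP BY Name, DATE(EventDateTime)
HAVING SUM(CASE WHEN Mark = 'Present' THEN 1 ELSE 0 END) = 0
ORDER BY Name, EventDate
"""


def absent_only_days(rows):
    """Load rows into a throwaway table (assumed name: events) and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE events (Name TEXT, EventDateTime TEXT, Mark TEXT)"
        )
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return conn.execute(ABSENT_ONLY_DAYS_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for name, day in absent_only_days(ROWS):
        print(name, day)
```

With the sample data this keeps 25, 26, 28, 29 and 30 March for Dave and 30 March for Jane; 24 and 27 March drop out because each has a Present row.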

An example of the output I'd want is:

Name EventDateTime Mark ConsecCount
Dave 2021-03-24 09:00:00 Present 0
Dave 2021-03-24 14:00:00 Absent 0
Dave 2021-03-25 09:00:00 Absent 1
Dave 2021-03-26 09:00:00 Absent 2
Dave 2021-03-27 09:00:00 Present 0
Dave 2021-03-27 14:00:00 Absent 0
Dave 2021-03-28 09:00:00 Absent 1
Dave 2021-03-29 10:00:00 Absent 2
Dave 2021-03-30 13:00:00 Absent 3
Jane 2021-03-30 13:00:00 Absent 0

This table is currently at 639931 records and they have been generated since 1st October and will continue to grow at this rate.

Any help, or advice on where to start, would be great.

You can get the result you want by numbering each person's rows in date order and then, for each row, looking up the most recent 'Present' row with a lateral join.

WITH with_row_numbers AS (
    -- Number each person's events in chronological order.
    SELECT e.*, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY EventDateTime) AS rn
    FROM events e
)
SELECT
    t1.Name,
    t1.EventDateTime,
    t1.Mark,
    -- Rows since the latest 'Present' row, minus one; never below zero.
    GREATEST(0, t1.rn - COALESCE(sub.prev_present_rn, 0) - 1) AS ConsecCount
FROM with_row_numbers AS t1
CROSS JOIN LATERAL (
    -- Latest 'Present' row for the same person at or before this event.
    SELECT MAX(t2.rn) AS prev_present_rn
    FROM with_row_numbers t2
    WHERE t2.Name = t1.Name
      AND t2.EventDateTime <= t1.EventDateTime
      AND t2.Mark = 'Present'
) sub
ORDER BY t1.Name, t1.EventDateTime;
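
To make the arithmetic in the query above easier to check, here is a minimal plain-Python sketch of the same rule: number each person's rows in time order, remember the row number of the latest 'Present' (0 if there is none yet), and take max(0, row number - that number - 1). The function name `consec_counts` and the `SAMPLE` list are assumptions for illustration only; this is a reference check, not a replacement for doing the work in PostgreSQL.

```python
from collections import defaultdict

# Sample rows copied from the table in the question.
SAMPLE = [
    ("Dave", "2021-03-24 09:00:00", "Present"),
    ("Dave", "2021-03-24 14:00:00", "Absent"),
    ("Dave", "2021-03-25 09:00:00", "Absent"),
    ("Dave", "2021-03-26 09:00:00", "Absent"),
    ("Dave", "2021-03-27 09:00:00", "Present"),
    ("Dave", "2021-03-27 14:00:00", "Absent"),
    ("Dave", "2021-03-28 09:00:00", "Absent"),
    ("Dave", "2021-03-29 10:00:00", "Absent"),
    ("Dave", "2021-03-30 13:00:00", "Absent"),
    ("Jane", "2021-03-30 13:00:00", "Absent"),
]


def consec_counts(rows):
    """Return (name, event_datetime, mark, count) tuples ordered by name, then time."""
    by_name = defaultdict(list)
    for name, ts, mark in rows:
        by_name[name].append((ts, mark))

    out = []
    for name in sorted(by_name):
        last_present = 0  # plays the role of COALESCE(..., 0) in the query
        # ISO-formatted timestamps sort chronologically as plain strings.
        for row_number, (ts, mark) in enumerate(sorted(by_name[name]), start=1):
            if mark == "Present":
                last_present = row_number
            out.append((name, ts, mark, max(0, row_number - last_present - 1)))
    return out


if __name__ == "__main__":
    for row in consec_counts(SAMPLE):
        print(*row)
```

On the sample data this gives 0, 0, 1, 2, 0, 0, 1, 2, 3 for Dave and 0 for Jane, matching the requested output. Note that the rule counts rows, not distinct days, so two absences recorded on the same day would each increase the count; if that can happen in the real data, the rows would need collapsing to one per person per day first.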


 