After a couple of hours of searching and trial and error, I am struggling to find any information on this. We have the following table structure:
Name | EventDateTime | Mark |
---|---|---|
Dave | 2021-03-24 09:00:00 | Present |
Dave | 2021-03-24 14:00:00 | Absent |
Dave | 2021-03-25 09:00:00 | Absent |
Dave | 2021-03-26 09:00:00 | Absent |
Dave | 2021-03-27 09:00:00 | Present |
Dave | 2021-03-27 14:00:00 | Absent |
Dave | 2021-03-28 09:00:00 | Absent |
Dave | 2021-03-29 10:00:00 | Absent |
Dave | 2021-03-30 13:00:00 | Absent |
Jane | 2021-03-30 13:00:00 | Absent |
Basically, these are attendance registers for people at events. We need to pull a report showing who we have had no contact from for more than x consecutive days. "Consecutive" here means consecutive days on which the person has events in the data, not consecutive calendar days. Also, if a person is marked Present on a day on which they were also marked Absent, the count needs to start again from the next day they were absent.
The first issue is getting the distinct dates on which there are only absences; the second is getting the number of consecutive days of absences. I've done the second in MySQL with variables, but have struggled to migrate it to PostgreSQL, which is where the reporting is done.
An example of the output I'd want is:
Name | EventDateTime | Mark | ConsecCount |
---|---|---|---|
Dave | 2021-03-24 09:00:00 | Present | 0 |
Dave | 2021-03-24 14:00:00 | Absent | 0 |
Dave | 2021-03-25 09:00:00 | Absent | 1 |
Dave | 2021-03-26 09:00:00 | Absent | 2 |
Dave | 2021-03-27 09:00:00 | Present | 0 |
Dave | 2021-03-27 14:00:00 | Absent | 0 |
Dave | 2021-03-28 09:00:00 | Absent | 1 |
Dave | 2021-03-29 10:00:00 | Absent | 2 |
Dave | 2021-03-30 13:00:00 | Absent | 3 |
Jane | 2021-03-30 13:00:00 | Absent | 0 |
This table currently holds 639,931 records, all generated since 1 October, and it will continue to grow at this rate.
Any help, or advice on where to start, would be great.
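You can get the result you want by numbering each person's rows in EventDateTime order and then, for each row, using a lateral join to look up the row number of the most recent 'Present' row at or before it. The count is the difference between the two row numbers minus one, floored at zero.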
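    -- Number each person's rows in chronological order.
    WITH with_row_numbers AS (
        SELECT e.*,
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY EventDateTime) AS rn
        FROM events e
    )
    SELECT
        t1.Name,
        t1.EventDateTime,
        t1.Mark,
        -- Rows since the last 'Present' row, minus one, never below zero.
        GREATEST(0, t1.rn - COALESCE(sub.prev_present_rn, 0) - 1) AS ConsecCount
    FROM with_row_numbers AS t1
    CROSS JOIN LATERAL (
        -- Row number of the latest 'Present' row at or before the current row
        -- (NULL if this person has none yet).
        SELECT MAX(t2.rn) AS prev_present_rn
        FROM with_row_numbers t2
        WHERE t2.Name = t1.Name
          AND t2.EventDateTime <= t1.EventDateTime
          AND t2.Mark = 'Present'
    ) sub
    ORDER BY t1.Name, t1.EventDateTime;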
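Note that the query above counts rows rather than days, so it matches the expected output only while each person has at most one row per absence-only day. If a per-day count is needed, one possible variant is sketched below. It is an illustration under stated assumptions, not a tested PostgreSQL solution: it runs the SQL through Python's built-in `sqlite3` module (SQLite 3.25 or later is assumed, for window functions) purely so the example is self-contained, and the names `had_present`, `grp`, `ConsecAbsentDays`, `consecutive_absent_days` and `absent_for_more_than` are made up for this sketch. The SQL sticks to constructs that should carry over to PostgreSQL, where `EventDateTime::date` would be the more usual spelling of `DATE(EventDateTime)`. It also differs from the expected output in one respect: a person with no 'Present' row at all (Jane) starts at 1 on their first absence-only day rather than 0.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Dave", "2021-03-24 09:00:00", "Present"),
    ("Dave", "2021-03-24 14:00:00", "Absent"),
    ("Dave", "2021-03-25 09:00:00", "Absent"),
    ("Dave", "2021-03-26 09:00:00", "Absent"),
    ("Dave", "2021-03-27 09:00:00", "Present"),
    ("Dave", "2021-03-27 14:00:00", "Absent"),
    ("Dave", "2021-03-28 09:00:00", "Absent"),
    ("Dave", "2021-03-29 10:00:00", "Absent"),
    ("Dave", "2021-03-30 13:00:00", "Absent"),
    ("Jane", "2021-03-30 13:00:00", "Absent"),
]

PER_DAY_QUERY = """
WITH days AS (
    -- One row per person per day; had_present = 1 if any event that day was 'Present'.
    SELECT Name,
           DATE(EventDateTime) AS EventDate,
           MAX(CASE WHEN Mark = 'Present' THEN 1 ELSE 0 END) AS had_present
    FROM events
    GROUP BY Name, DATE(EventDateTime)
),
grouped AS (
    -- Running count of 'Present' days: every 'Present' day starts a new group.
    SELECT Name, EventDate, had_present,
           SUM(had_present) OVER (PARTITION BY Name ORDER BY EventDate) AS grp
    FROM days
)
-- Within each group, count the absence-only days seen so far.
SELECT Name, EventDate,
       SUM(1 - had_present) OVER (PARTITION BY Name, grp ORDER BY EventDate)
           AS ConsecAbsentDays
FROM grouped
ORDER BY Name, EventDate
"""


def consecutive_absent_days(rows):
    """Return (Name, EventDate, ConsecAbsentDays) tuples, one per person per day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (Name TEXT, EventDateTime TEXT, Mark TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return conn.execute(PER_DAY_QUERY).fetchall()
    finally:
        conn.close()


def absent_for_more_than(rows, x):
    """Return the names whose longest run of absence-only days exceeds x."""
    longest = {}
    for name, _day, count in consecutive_absent_days(rows):
        longest[name] = max(longest.get(name, 0), count)
    return sorted(name for name, count in longest.items() if count > x)


if __name__ == "__main__":
    for row in consecutive_absent_days(ROWS):
        print(row)
    print(absent_for_more_than(ROWS, 2))
```

The `days` CTE on its own also covers the first issue from the question: filtering it on `had_present = 0` gives the distinct dates on which a person had only absences.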