I have the following table of intermediate journey steps that I would like to aggregate to get one row per person, per day. The intermediate steps might involve the passenger exiting and entering various gates at a station, and these always follow one after the other.
So, in the table below, passenger 1234 exits at station 5598 at time 1071 and then enters at station 796 at time 1073 (the times are coded as numerical values). They then exit at station 635 at time 1086, followed by an entry at station 5148 at time 1088. This particular passenger has 2 intermediate legs in their journey. Passenger 5678 has only one intermediate leg.
The table is as follows:
ID    day  station  time  type
1234  133  5598     1071  exit
1234  133  796      1073  entry
1234  133  635      1086  exit
1234  133  5148     1088  entry
5678  133  8909     1305  exit
5678  133  5158     1306  entry
and I want to get it to look like this:
ID    day  stage1_exittime  stage1_exitstation  stage2_entrytime  stage2_entrystation  stage2_exittime  stage2_exitstation  stage3_entrytime  stage3_entrystation
1234  133  1071             5598                1073              796                  1086             635                 1088              5148
5678  133  1305             8909                1306              5158                 0                0                   0                 0
I have tried first_value, over and partition by, but can't get it to work. The key is that journeys with only 1 intermediate leg must not be populated in the stage2_exit and stage3 columns of the table above.
It should be noted that the passenger might have up to 5 intermediate legs in their journey (not 3 as the example shows).
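This should help you get your result.
ROW_NUMBER() numbers the exits and the entries separately for each passenger and day, in time order, so the combination of row number and type identifies each stage: the first exit closes stage 1, the first entry opens stage 2, the second exit closes stage 2, and so on. To cover up to 5 intermediate legs, repeat the same pattern for Rn = 3 to 5.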
SELECT "ID",
       "day",
       -- Rn = 1: first exit and first entry of the day.
       -- Stages with no matching row come back as NULL; wrap each MAX(...)
       -- in COALESCE(..., 0) if zeros are wanted instead.
       MAX(CASE WHEN Rn = 1 AND "type" = 'exit'  THEN "time"    END) AS stage1_exittime,
       MAX(CASE WHEN Rn = 1 AND "type" = 'exit'  THEN "station" END) AS stage1_exitstation,
       MAX(CASE WHEN Rn = 1 AND "type" = 'entry' THEN "time"    END) AS stage2_entrytime,
       MAX(CASE WHEN Rn = 1 AND "type" = 'entry' THEN "station" END) AS stage2_entrystation,
       -- Rn = 2: second exit and second entry of the day
       MAX(CASE WHEN Rn = 2 AND "type" = 'exit'  THEN "time"    END) AS stage2_exittime,
       MAX(CASE WHEN Rn = 2 AND "type" = 'exit'  THEN "station" END) AS stage2_exitstation,
       MAX(CASE WHEN Rn = 2 AND "type" = 'entry' THEN "time"    END) AS stage3_entrytime,
       MAX(CASE WHEN Rn = 2 AND "type" = 'entry' THEN "station" END) AS stage3_entrystation
FROM (
    -- Number exits and entries separately per passenger and day, in time order
    SELECT "ID",
           "station",
           "time",
           "type",
           "day",
           ROW_NUMBER() OVER (PARTITION BY "ID", "day", "type" ORDER BY "time") AS Rn
    FROM myTable
) mt
GROUP BY "ID",
         "day"
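For anyone who wants to try the ROW_NUMBER() plus conditional aggregation approach end to end, here is a minimal sketch that runs the same query shape against the sample data in an in-memory SQLite database from Python. It is only an illustration under a few assumptions: SQLite 3.25 or newer (needed for window functions), a limit of 5 intermediate legs as mentioned in the question, and COALESCE(..., 0) added so that unused stages show 0 as in the desired output. The names MAX_LEGS, CASE_TEMPLATE, build_query and pivot_journeys are made up for this example; the table name myTable and the column names come from the posts above.

```python
import sqlite3

MAX_LEGS = 5  # assumption from the question: at most 5 intermediate legs

# Sample rows from the question: (ID, day, station, time, type)
ROWS = [
    (1234, 133, 5598, 1071, "exit"),
    (1234, 133, 796, 1073, "entry"),
    (1234, 133, 635, 1086, "exit"),
    (1234, 133, 5148, 1088, "entry"),
    (5678, 133, 8909, 1305, "exit"),
    (5678, 133, 5158, 1306, "entry"),
]

# One output column: pick the value for a given row number and type, else 0.
CASE_TEMPLATE = (
    'COALESCE(MAX(CASE WHEN Rn = {leg} AND "type" = \'{kind}\' '
    'THEN "{field}" END), 0) AS stage{stage}_{kind}{field}'
)


def build_query(max_legs):
    """Generate the pivot query for max_legs exit/entry pairs."""
    cols = []
    for leg in range(1, max_legs + 1):
        # The n-th exit closes stage n; the n-th entry opens stage n + 1.
        for kind, stage in (("exit", leg), ("entry", leg + 1)):
            for field in ("time", "station"):
                cols.append(CASE_TEMPLATE.format(
                    leg=leg, kind=kind, field=field, stage=stage))
    return (
        'SELECT "ID", "day",\n       ' + ",\n       ".join(cols) + "\n"
        'FROM (SELECT "ID", "station", "time", "type", "day",\n'
        '             ROW_NUMBER() OVER (PARTITION BY "ID", "day", "type"\n'
        '                                ORDER BY "time") AS Rn\n'
        "      FROM myTable) mt\n"
        'GROUP BY "ID", "day"\n'
        'ORDER BY "ID", "day"'
    )


def pivot_journeys(rows, max_legs=MAX_LEGS):
    """Load rows into an in-memory table and return one tuple per ID and day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE myTable ("ID" INTEGER, "day" INTEGER, '
            '"station" INTEGER, "time" INTEGER, "type" TEXT)'
        )
        conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(build_query(max_legs)).fetchall()
    finally:
        conn.close()


result = pivot_journeys(ROWS)
for row in result:
    print(row)
```

Generating the column list in a loop just avoids typing out 20 nearly identical CASE expressions by hand; the SQL it produces has the same shape as the hand-written query, only extended to Rn = 5. Other databases may need small syntax changes.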