
SQL - aggregating rows of data

I have the following table of intermediate steps in journeys, which I would like to aggregate to get one row per person per day. The intermediate steps involve the passenger exiting and entering gates at stations, and these exits and entries always follow one after the other.

So, in the table below, passenger 1234 exits at station 5598 at time 1071 and then enters at station 796 at time 1073 (the times are coded as numerical values). They then exit at station 635 at time 1086, followed by an entry at station 5148 at time 1088. This passenger has 2 intermediate legs in their journey. Passenger 5678 has only one intermediate leg.

The table is as follows:

ID      day    station    time    type
1234    133    5598       1071    exit
1234    133    796        1073    entry
1234    133    635        1086    exit
1234    133    5148       1088    entry
5678    133    8909       1305    exit
5678    133    5158       1306    entry

and I want to get it to look like this:

ID    day    stage1_exittime    stage1_exitstation    stage2_entrytime    stage2_entrystation    stage2_exittime    stage2_exitstation    stage3_entrytime    stage3_entrystation
1234  133    1071               5598                  1073                796                    1086               635                   1088                5148
5678  133    1305               8909                  1306                5158                   0                  0                     0                   0

I have tried first_value, over and partition by, but can't get it to work. The key is that journeys with only 1 intermediate leg must not be populated in the stage2_exit and stage3 columns of the table above.

It should be noted that the passenger might have up to 5 intermediate legs in their journey (not 3 as the example shows).

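This should help you get your result.

ROW_NUMBER numbers the exits and the entries separately, in time order, for each ID and day. The n-th exit and the n-th entry therefore share the same row number, and combining the row number with the type identifies which stage a row belongs to.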
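To see what that numbering looks like on the sample rows, here is a small sketch that loads them into an in-memory SQLite database from Python and prints the row number each one receives. The SQLite/Python harness is my own assumption for illustration (the question does not name a database), and it needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Sample rows from the question: (ID, day, station, time, type).
rows = [
    (1234, 133, 5598, 1071, "exit"),
    (1234, 133, 796, 1073, "entry"),
    (1234, 133, 635, 1086, "exit"),
    (1234, 133, 5148, 1088, "entry"),
    (5678, 133, 8909, 1305, "exit"),
    (5678, 133, 5158, 1306, "entry"),
]

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE myTable ("ID" INT, "day" INT, "station" INT, "time" INT, "type" TEXT)'
)
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?, ?)", rows)

# Number exits and entries separately, in time order, per passenger and day.
numbered = conn.execute(
    """
    SELECT "ID", "type", "time",
           ROW_NUMBER() OVER (PARTITION BY "ID", "day", "type" ORDER BY "time") AS Rn
    FROM myTable
    ORDER BY "ID", "time"
    """
).fetchall()

for row in numbered:
    print(row)
```

For passenger 1234 the first exit and first entry both get Rn = 1 and the second pair gets Rn = 2, which is what the CASE expressions in the query rely on.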

SELECT  "ID",
        "day",
        MAX(CASE WHEN Rn = 1 AND "type" = 'exit' THEN "time" END) AS stage1_exittime, 
        MAX(CASE WHEN Rn = 1 AND "type" = 'exit' THEN "station" END) AS stage1_exitstation,
        MAX(CASE WHEN Rn = 1 AND "type" = 'entry' THEN "time" END) AS stage2_entrytime,
        MAX(CASE WHEN Rn = 1 AND "type" = 'entry' THEN "station" END) AS stage2_entrystation,
        MAX(CASE WHEN Rn = 2 AND "type" = 'exit' THEN "time" END) AS stage2_exittime,
        MAX(CASE WHEN Rn = 2 AND "type" = 'exit' THEN "station" END) AS stage2_exitstation,
        MAX(CASE WHEN Rn = 2 AND "type" = 'entry' THEN "time" END) AS stage3_entrytime,
        MAX(CASE WHEN Rn = 2 AND "type" = 'entry' THEN "station" END) AS stage3_entrystation
FROM    (   
            SELECT  "ID",
                    "station",
                    "time",
                    "type",
                    "day",
                    ROW_NUMBER() OVER (PARTITION BY "ID", "day", "type" ORDER BY "time") AS Rn
            FROM    myTable 
        ) mt
GROUP BY "ID",
        "day"
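Two things to be aware of: as written, the query returns NULL rather than 0 for stages a passenger does not have, and it only covers 2 intermediate legs. The sketch below is one way to address both, under some assumptions of mine: it generates the same CASE columns for up to 5 legs (the limit mentioned in the question), wraps each in COALESCE(..., 0), and runs the result against the sample rows in an in-memory SQLite database (3.25 or later for window functions). The helper name build_pivot_sql and the Python/SQLite harness are hypothetical and not part of the original answer. It also relies on exits and entries strictly alternating, starting with an exit, as the question describes.

```python
import sqlite3

MAX_LEGS = 5  # assumption from the question: at most 5 intermediate legs


def build_pivot_sql(max_legs):
    """Build the pivot query with one exit/entry column group per leg."""
    cols = []
    for leg in range(1, max_legs + 1):
        # The leg-th exit ends stage `leg`; the leg-th entry starts stage `leg + 1`.
        for kind, stage in (("exit", leg), ("entry", leg + 1)):
            for field in ("time", "station"):
                cols.append(
                    f"COALESCE(MAX(CASE WHEN Rn = {leg} AND \"type\" = '{kind}' "
                    f"THEN \"{field}\" END), 0) AS stage{stage}_{kind}{field}"
                )
    select_list = ",\n       ".join(cols)
    return f'''
SELECT "ID", "day",
       {select_list}
FROM (
    SELECT "ID", "day", "station", "time", "type",
           ROW_NUMBER() OVER (PARTITION BY "ID", "day", "type" ORDER BY "time") AS Rn
    FROM myTable
) mt
GROUP BY "ID", "day"
ORDER BY "ID", "day"
'''


# Sample rows from the question, loaded into a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE myTable ("ID" INT, "day" INT, "station" INT, "time" INT, "type" TEXT)'
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?)",
    [
        (1234, 133, 5598, 1071, "exit"),
        (1234, 133, 796, 1073, "entry"),
        (1234, 133, 635, 1086, "exit"),
        (1234, 133, 5148, 1088, "entry"),
        (5678, 133, 8909, 1305, "exit"),
        (5678, 133, 5158, 1306, "entry"),
    ],
)

result = conn.execute(build_pivot_sql(MAX_LEGS)).fetchall()
for row in result:
    print(row)
```

Each result row has the ID, the day and then 20 stage columns (stage1_exit through stage6_entry), with zeros in every stage the passenger did not reach. If you would rather keep NULL for missing stages, drop the COALESCE wrapper.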
