I have a table that looks as follows:
TS | Serial Number | Activity | Address |
---|---|---|---|
1 | 123456 | | AAAABBBBCCCC |
2 | 123456 | | AAAABBBBCCCC |
3 | 123456 | A | AAAABBBBCCCC |
4 | 123456 | E | AAAABBBBCCCC |
5 | 876543 | A | UNIUNIUNIUNI |
6 | 123456 | A | AAAABBBBCCCC |
7 | 123456 | E | WAHWAHWAHWAH |
8 | 123456 | | WAHWAHWAHWAH |
9 | 876543 | E | ALFALFALFALF |
10 | 876543 | | ALFALFALFALF |
`TS` is a timestamp column that usually contains an ISO date string. I've shortened it here for simplicity.
As you can see, a change in the `Address` field CAN occur whenever there's an `Activity = E`.
The ungrouped rows can be in semi-arbitrary order, though each Activity `A` within a group, when sorted by timestamp (`TS`), MUST always be followed by an Activity `E`, though not necessarily immediately. There CAN be `<null>` Activities in between the `A` and `E`. If there is no `E` following the last `A` within a group, sorted by `TS`, the corresponding `Serial Number` can safely be considered invalid.
For each `Serial Number`, sorted by `TS` in ascending order, I need the `Address` of the last occurrence of `Activity = E`, if and only if that last `E` is NOT followed by another `A`; otherwise `Address` may contain `INVALID`, or alternatively the corresponding `Serial Number` can be omitted from the result.
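
As a reference point for checking any query against this requirement, the rule can be written out in a few lines of Python. This is only a sketch under assumptions: rows with an empty Activity are modelled as `None`, the function name `last_valid_addresses` is made up for illustration, and serial number `555555` is a hypothetical addition (not part of the original data) whose last `A` is never closed by an `E`.

```python
# (ts, serial_number, activity, address); None models a <null> Activity.
ROWS = [
    (1, "123456", None, "AAAABBBBCCCC"),
    (2, "123456", None, "AAAABBBBCCCC"),
    (3, "123456", "A", "AAAABBBBCCCC"),
    (4, "123456", "E", "AAAABBBBCCCC"),
    (5, "876543", "A", "UNIUNIUNIUNI"),
    (6, "123456", "A", "AAAABBBBCCCC"),
    (7, "123456", "E", "WAHWAHWAHWAH"),
    (8, "123456", None, "WAHWAHWAHWAH"),
    (9, "876543", "E", "ALFALFALFALF"),
    (10, "876543", None, "ALFALFALFALF"),
    (11, "555555", "E", "OLDOLDOLDOLD"),  # hypothetical row
    (12, "555555", "A", "OLDOLDOLDOLD"),  # hypothetical row: A with no later E
]


def last_valid_addresses(rows):
    """Map each serial number to the address of its last E,
    but only if no A comes after that E (input order does not matter)."""
    latest = {}  # serial -> (ts, activity, address) of the latest A/E row
    for ts, serial, activity, address in rows:
        if activity not in ("A", "E"):
            continue  # <null> activities never affect the outcome
        if serial not in latest or ts > latest[serial][0]:
            latest[serial] = (ts, activity, address)
    # A serial is valid exactly when its latest A/E row is an E.
    return {s: addr for s, (_, act, addr) in latest.items() if act == "E"}


print(last_valid_addresses(ROWS))
# {'123456': 'WAHWAHWAHWAH', '876543': 'ALFALFALFALF'}
```

The hypothetical serial `555555` is dropped here; returning `INVALID` for it instead would satisfy the requirement equally well.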
SELECT DISTINCT ON (ser_no)                                                          -- 4
    *
FROM (
    SELECT
        *,
        MAX(ts) FILTER (WHERE activity = 'A') OVER (PARTITION BY ser_no) AS last_a,  -- 1
        MAX(ts) FILTER (WHERE activity = 'E') OVER (PARTITION BY ser_no) AS last_e
    FROM
        mytable
) s
WHERE last_a < last_e        -- 2
    AND activity = 'E'       -- 3
ORDER BY ser_no, ts DESC     -- 4
1. Find the last `A` and the last `E` per `ser_no` using the `MAX()` window function.
2. Keep only the `ser_no` partitions where the last `A` was before the last `E`.
3. Keep only the `E` records.
4. Order the `E` records by timestamp `DESC` so that the most recent one is the top-most record per group, and remove all others using the `DISTINCT ON` clause.
You need any "E" row not followed by any "A" or "E" with the same serial number.
In SQL, this translates to:
SELECT Serial_Number, Address
FROM Tbl ret
WHERE Activity = 'E'
  AND NOT EXISTS (
    SELECT *
    FROM Tbl witness
    WHERE witness.Serial_Number = ret.Serial_Number
      AND witness.TS > ret.TS
      AND witness.Activity IN ('A', 'E')
  );
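
Since `NOT EXISTS` is standard SQL, one way to sanity-check this query without a PostgreSQL instance is an in-memory SQLite database driven from Python. The following is a minimal sketch, not a definitive test: it assumes the sample rows from the question with empty Activities stored as NULL, adds a hypothetical serial number `555555` whose last `E` is followed by an `A`, and appends an `ORDER BY` purely to make the output deterministic. The helper name `run_not_exists` is made up for illustration.

```python
import sqlite3

# Sample rows from the question; None becomes NULL (a <null> Activity).
SAMPLE_ROWS = [
    (1, "123456", None, "AAAABBBBCCCC"),
    (2, "123456", None, "AAAABBBBCCCC"),
    (3, "123456", "A", "AAAABBBBCCCC"),
    (4, "123456", "E", "AAAABBBBCCCC"),
    (5, "876543", "A", "UNIUNIUNIUNI"),
    (6, "123456", "A", "AAAABBBBCCCC"),
    (7, "123456", "E", "WAHWAHWAHWAH"),
    (8, "123456", None, "WAHWAHWAHWAH"),
    (9, "876543", "E", "ALFALFALFALF"),
    (10, "876543", None, "ALFALFALFALF"),
    (11, "555555", "E", "OLDOLDOLDOLD"),  # hypothetical row
    (12, "555555", "A", "OLDOLDOLDOLD"),  # hypothetical row: A after the last E
]

NOT_EXISTS_QUERY = """
SELECT Serial_Number, Address
FROM Tbl ret
WHERE Activity = 'E'
  AND NOT EXISTS (
    SELECT *
    FROM Tbl witness
    WHERE witness.Serial_Number = ret.Serial_Number
      AND witness.TS > ret.TS
      AND witness.Activity IN ('A', 'E')
  )
ORDER BY Serial_Number  -- added only for deterministic output
"""


def run_not_exists(rows):
    """Load rows into an in-memory table and run the NOT EXISTS query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Tbl (TS INTEGER, Serial_Number TEXT, Activity TEXT, Address TEXT)"
    )
    conn.executemany("INSERT INTO Tbl VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(NOT_EXISTS_QUERY).fetchall()
    conn.close()
    return result


print(run_not_exists(SAMPLE_ROWS))
# [('123456', 'WAHWAHWAHWAH'), ('876543', 'ALFALFALFALF')]
```

The row at `TS = 4` is rejected because the `A` at `TS = 6` acts as a witness, while the `E` at `TS = 7` survives because only a NULL Activity follows it.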
Hmmm... You can use `distinct on` if you want to include the invalid records:
select distinct on (ser_no) ser_no, ts,
       (case when activity = 'E' then address
             else 'INVALID'
        end) as address
from t
where activity in ('E', 'A')
order by ser_no, ts desc;
This just gets the last E/A row for each `ser_no` and assigns the address accordingly.
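
`DISTINCT ON` is PostgreSQL-specific, so to try the labelling idea elsewhere the sketch below swaps it for a correlated subquery that picks the latest E/A row per `ser_no`, and runs it in an in-memory SQLite database from Python. It is an illustration under assumptions rather than the query above: the table is populated with the question's sample rows (empty Activities as NULL) plus a hypothetical serial number `555555` that ends on an `A`, and the helper name `label_last_rows` is made up.

```python
import sqlite3

# Sample rows from the question; None becomes NULL (a <null> Activity).
T_ROWS = [
    ("123456", 1, None, "AAAABBBBCCCC"),
    ("123456", 2, None, "AAAABBBBCCCC"),
    ("123456", 3, "A", "AAAABBBBCCCC"),
    ("123456", 4, "E", "AAAABBBBCCCC"),
    ("876543", 5, "A", "UNIUNIUNIUNI"),
    ("123456", 6, "A", "AAAABBBBCCCC"),
    ("123456", 7, "E", "WAHWAHWAHWAH"),
    ("123456", 8, None, "WAHWAHWAHWAH"),
    ("876543", 9, "E", "ALFALFALFALF"),
    ("876543", 10, None, "ALFALFALFALF"),
    ("555555", 11, "E", "OLDOLDOLDOLD"),  # hypothetical row
    ("555555", 12, "A", "OLDOLDOLDOLD"),  # hypothetical row: ends on an A
]

# Portable stand-in for DISTINCT ON: keep the row whose ts equals the
# latest E/A timestamp of its ser_no, then label it.
LABEL_QUERY = """
SELECT t.ser_no, t.ts,
       CASE WHEN t.activity = 'E' THEN t.address ELSE 'INVALID' END AS address
FROM t
WHERE t.activity IN ('E', 'A')
  AND t.ts = (SELECT MAX(t2.ts)
              FROM t t2
              WHERE t2.ser_no = t.ser_no
                AND t2.activity IN ('E', 'A'))
ORDER BY t.ser_no
"""


def label_last_rows(rows):
    """Return (ser_no, ts, address-or-INVALID) for each serial number."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (ser_no TEXT, ts INTEGER, activity TEXT, address TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(LABEL_QUERY).fetchall()
    conn.close()
    return result


for row in label_last_rows(T_ROWS):
    print(row)
# ('123456', 7, 'WAHWAHWAHWAH')
# ('555555', 12, 'INVALID')
# ('876543', 9, 'ALFALFALFALF')
```

This assumes timestamps are unique within a serial number; with ties, the subquery form can return more than one row per `ser_no`, whereas `DISTINCT ON` always returns exactly one.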
If you want to remove them, then you can still manage without a subquery. It would be nice if Postgres had a "first"/"last" aggregation function, but you can mimic it with arrays:
select ser_no, max(ts),
       (array_agg(address order by ts desc))[1] as last_address
from t
where activity in ('E', 'A')
group by ser_no
having max(ts) filter (where activity = 'E') > max(ts) filter (where activity = 'A');
With a subquery, I would suggest:
select t.*
from t
where t.activity = 'E' and
      t.ts = (select max(t2.ts)
              from t t2
              where t2.ser_no = t.ser_no and
                    t2.activity in ('A', 'E')
             );
This fetches the last "E" row when it is the last row for either E or A.