I have a table account with the following structure:
| agg_type | agg_id | sequence | payload | is_snapshot | timestamp |
| "account" | "agg_1" | 1 | "..." | false | ... |
| "account" | "agg_1" | 2 | "..." | true | ... |
| "account" | "agg_1" | 3 | "..." | false | ... |
| "account" | "agg_1" | 4 | "..." | false | ... |
| "account" | "agg_1" | 5 | "..." | false | ... |
| "account" | "agg_1" | 6 | "..." | false | ... |
| "account" | "agg_1" | 7 | "..." | true | ... |
| "account" | "agg_1" | 8 | "..." | false | ... |
I need to write a query that retrieves all rows for a specific aggregate from its latest snapshot onward. For instance, for the table above the query would return the last two rows (sequences 7 and 8).
I think that the query would go something like:
SELECT * FROM account
WHERE
agg_type='account'
AND agg_id='agg_1'
ORDER BY sequence ASC
LIMIT (???);
It's the (???) part that I'm not quite sure how to implement.
Obs:
Simplistically, we can just retrieve all rows whose sequence is greater than or equal to the highest sequence that is a snapshot:
SELECT * FROM account a
WHERE
a.agg_type='account'
AND a.agg_id='agg_1'
AND a.sequence >=
(SELECT MAX(sequence) FROM account b WHERE a.agg_type = b.agg_type AND a.agg_id = b.agg_id AND b.is_snapshot = true)
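As a quick sanity check of the correlated subquery above, here is a minimal sketch that runs it against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, stores is_snapshot as 0/1, and leaves out the timestamp column; those choices and the helper name rows_since_latest_snapshot are assumptions made for illustration only.

```python
import sqlite3


def rows_since_latest_snapshot(conn, agg_type, agg_id):
    """Return (sequence, is_snapshot) rows from the latest snapshot onward."""
    query = """
        SELECT a.sequence, a.is_snapshot
        FROM account a
        WHERE a.agg_type = ?
          AND a.agg_id = ?
          AND a.sequence >= (
              SELECT MAX(b.sequence)
              FROM account b
              WHERE b.agg_type = a.agg_type
                AND b.agg_id = a.agg_id
                AND b.is_snapshot = 1
          )
        ORDER BY a.sequence
    """
    return conn.execute(query, (agg_type, agg_id)).fetchall()


# Rebuild the sample table from the question (snapshots at sequences 2 and 7).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "agg_type TEXT, agg_id TEXT, sequence INTEGER, "
    "payload TEXT, is_snapshot INTEGER)"
)
snapshots = {2, 7}
conn.executemany(
    "INSERT INTO account VALUES ('account', 'agg_1', ?, '...', ?)",
    [(seq, int(seq in snapshots)) for seq in range(1, 9)],
)

result = rows_since_latest_snapshot(conn, "account", "agg_1")
print(result)  # [(7, 1), (8, 0)]
```

Note that when an aggregate has no snapshot at all, MAX returns NULL, the comparison is never true, and the query returns no rows. If you would rather get every event in that case, wrapping the subquery in COALESCE(..., 0) is one option.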
If you wanted to do this for all aggregates at once, it might be clearer to write it as a join:
SELECT a.*
FROM
account a
INNER JOIN
(
SELECT
agg_type,
agg_id,
MAX(sequence) as maxseq
FROM account b
WHERE b.is_snapshot = true
GROUP BY agg_type, agg_id
) maxes
ON
a.agg_type = maxes.agg_type and
a.agg_id = maxes.agg_id and
a.sequence >= maxes.maxseq
That's not to say we couldn't do either task with either form (and internally Postgres will probably execute them the same way anyway), but I've always felt that using a join as a restriction, as in "here are 10000 rows, and I want only the 2000 rows that meet the criteria laid down by these 1000 rows", is most clearly thought of in terms of blocks of data that are joined together.
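-- Note: this keeps the two most recent rows per aggregate. It matches the
-- sample data only because the latest snapshot is the second-to-last row.
WITH a AS (
SELECT *,
row_number() over (partition BY a.agg_type, a.agg_id ORDER BY a.sequence DESC) rnk
FROM account a
)
SELECT * FROM a WHERE a.rnk <= 2;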
A window function can pull this for all (agg_type, agg_id) combinations with only one sort:
with mark as (
select *,
bool_or(is_snapshot) over w as trail_true
from account
window w as (partition by agg_type, agg_id
order by sequence
rows between 1 following
and unbounded following)
)
select *
from mark
where not coalesce(trail_true, false)
order by agg_type, agg_id, sequence
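To see what the window-function query returns across several aggregates, here is a rough sketch using Python's built-in sqlite3 module in place of PostgreSQL. SQLite has no bool_or, so MAX over a 0/1 is_snapshot column is substituted for it; that substitution, the second aggregate agg_2 (which has no snapshot), and the omitted payload and timestamp columns are all assumptions for illustration. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "agg_type TEXT, agg_id TEXT, sequence INTEGER, is_snapshot INTEGER)"
)

# agg_1 mirrors the question (snapshots at 2 and 7); agg_2 has no snapshot.
rows = [("account", "agg_1", seq, int(seq in {2, 7})) for seq in range(1, 9)]
rows += [("account", "agg_2", seq, 0) for seq in range(1, 4)]
conn.executemany("INSERT INTO account VALUES (?, ?, ?, ?)", rows)

# trail_true is 1 when any LATER row in the same aggregate is a snapshot.
# For the last row the frame is empty, so MAX yields NULL.
query = """
    WITH mark AS (
        SELECT agg_id,
               sequence,
               MAX(is_snapshot) OVER (
                   PARTITION BY agg_type, agg_id
                   ORDER BY sequence
                   ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING
               ) AS trail_true
        FROM account
    )
    SELECT agg_id, sequence
    FROM mark
    WHERE COALESCE(trail_true, 0) = 0
    ORDER BY agg_id, sequence
"""
result = conn.execute(query).fetchall()
print(result)
# [('agg_1', 7), ('agg_1', 8), ('agg_2', 1), ('agg_2', 2), ('agg_2', 3)]
```

One design difference worth noticing: an aggregate without any snapshot keeps all of its rows here, whereas an approach that compares against MAX(sequence) of the snapshot rows returns nothing for it.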