
Limit query result based on field value

I have a table account with the following structure:

| agg_type  | agg_id  | sequence | payload | is_snapshot | timestamp |
| "account" | "agg_1" | 1        | "..."   | false       | ...       |
| "account" | "agg_1" | 2        | "..."   | true        | ...       |
| "account" | "agg_1" | 3        | "..."   | false       | ...       |
| "account" | "agg_1" | 4        | "..."   | false       | ...       |
| "account" | "agg_1" | 5        | "..."   | false       | ...       |
| "account" | "agg_1" | 6        | "..."   | false       | ...       |
| "account" | "agg_1" | 7        | "..."   | true        | ...       |
| "account" | "agg_1" | 8        | "..."   | false       | ...       |

I need to write a query that retrieves all rows of a specific aggregate from the latest snapshot onward. For instance, for the table above the query would return the last two rows (sequences 7 and 8).

I think the query would look something like this:

SELECT * FROM account 
WHERE
  agg_type='account'
  AND agg_id='agg_1'
ORDER BY sequence ASC
LIMIT (???);

It's the (???) part that I'm not quite sure how to implement.

Notes:

  • I'm using Postgres, in case that helps.
  • The (agg_type, agg_id, sequence) combination is the primary key.
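
To try the queries against the sample data, the table described above could be recreated roughly as follows. This is only a sketch: the column types are assumptions (the question shows values, not a schema), and the payload and timestamp values are placeholders.

```sql
-- Sketch of the table from the question. Column types are assumed;
-- only the column names and the primary key come from the question.
CREATE TABLE account (
    agg_type    text        NOT NULL,
    agg_id      text        NOT NULL,
    sequence    bigint      NOT NULL,
    payload     text        NOT NULL,
    is_snapshot boolean     NOT NULL DEFAULT false,
    "timestamp" timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (agg_type, agg_id, sequence)
);

-- The eight sample rows; snapshots sit at sequences 2 and 7.
INSERT INTO account (agg_type, agg_id, sequence, payload, is_snapshot)
VALUES
    ('account', 'agg_1', 1, '...', false),
    ('account', 'agg_1', 2, '...', true),
    ('account', 'agg_1', 3, '...', false),
    ('account', 'agg_1', 4, '...', false),
    ('account', 'agg_1', 5, '...', false),
    ('account', 'agg_1', 6, '...', false),
    ('account', 'agg_1', 7, '...', true),
    ('account', 'agg_1', 8, '...', false);
```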

Put simply, we can retrieve all rows whose sequence is greater than or equal to the highest sequence that belongs to a snapshot:

SELECT * FROM account a
WHERE
  a.agg_type='account'
  AND a.agg_id='agg_1'
  AND a.sequence >=
    (SELECT MAX(sequence) FROM account b WHERE a.agg_type = b.agg_type AND a.agg_id = b.agg_id AND b.is_snapshot = true);
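
One edge case worth noting: if an aggregate has no snapshot yet, MAX returns NULL, the >= comparison is never true, and the query returns no rows. A possible variation, assuming you would rather get the full history in that case and that sequence numbers start at 1 (neither is stated in the question):

```sql
-- Variation: fall back to the whole history when no snapshot exists.
-- Assumption: sequence numbers start at 1, so 0 sorts before every row.
SELECT a.*
FROM account a
WHERE a.agg_type = 'account'
  AND a.agg_id = 'agg_1'
  AND a.sequence >= COALESCE(
        (SELECT MAX(b.sequence)
         FROM account b
         WHERE b.agg_type = a.agg_type
           AND b.agg_id = a.agg_id
           AND b.is_snapshot),
        0)
ORDER BY a.sequence;
```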

If you want to do this for every aggregate at once, it may be clearer to write it as a join:

SELECT a.*
FROM
  account a
  INNER JOIN
  (
    -- latest snapshot sequence per aggregate
    SELECT
      agg_type,
      agg_id,
      MAX(sequence) as maxseq
    FROM account b
    WHERE b.is_snapshot = true
    GROUP BY agg_type, agg_id
  ) maxes
  ON
    a.agg_type = maxes.agg_type and
    a.agg_id = maxes.agg_id and
    a.sequence >= maxes.maxseq;

That's not to say we couldn't do either task with either form (and internally Postgres will probably execute them the same way anyway), but I've always felt that a restriction like "here are 10000 rows, and I want only the 2000 rows that meet a criterion laid down by these 1000 rows" is most clearly thought of in terms of blocks of data that are joined together.
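
Whether the two forms really end up with the same plan can be checked with EXPLAIN, which prints the planner's chosen plan without running the query. The sketch below compares the subquery form with a join form restricted to one aggregate; the actual plans depend on the Postgres version, the table statistics, and the available indexes, so no particular output is assumed here.

```sql
-- Plan for the correlated-subquery form.
EXPLAIN
SELECT a.*
FROM account a
WHERE a.agg_type = 'account'
  AND a.agg_id = 'agg_1'
  AND a.sequence >= (SELECT MAX(b.sequence)
                     FROM account b
                     WHERE b.agg_type = a.agg_type
                       AND b.agg_id = a.agg_id
                       AND b.is_snapshot);

-- Plan for the join form, limited to the same aggregate.
EXPLAIN
SELECT a.*
FROM account a
INNER JOIN (
    SELECT agg_type, agg_id, MAX(sequence) AS maxseq
    FROM account
    WHERE is_snapshot
    GROUP BY agg_type, agg_id
) maxes
  ON a.agg_type = maxes.agg_type
 AND a.agg_id = maxes.agg_id
 AND a.sequence >= maxes.maxseq
WHERE a.agg_type = 'account'
  AND a.agg_id = 'agg_1';
```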

-- Returns the two highest-sequence rows per (agg_type, agg_id). This matches the
-- sample data only because the latest snapshot happens to be the second-to-last row.
WITH a AS (
  SELECT *,
         row_number() OVER (PARTITION BY a.agg_type, a.agg_id ORDER BY a.sequence DESC) AS rnk
  FROM account a
)
SELECT * FROM a WHERE a.rnk <= 2;
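
The row_number() query above keeps a fixed number of rows per aggregate, so it only fits data where the latest snapshot is always the second-to-last row. A sketch of a window-function variant that instead finds the latest snapshot per aggregate, using an aggregate with a FILTER clause as a window function (the names marked and last_snapshot_seq are made up for illustration):

```sql
WITH marked AS (
    SELECT a.*,
           -- highest snapshot sequence within the aggregate;
           -- NULL when the aggregate has no snapshot at all
           MAX(a.sequence) FILTER (WHERE a.is_snapshot)
               OVER (PARTITION BY a.agg_type, a.agg_id) AS last_snapshot_seq
    FROM account a
)
SELECT agg_type, agg_id, sequence, payload, is_snapshot, "timestamp"
FROM marked
WHERE sequence >= last_snapshot_seq
ORDER BY agg_type, agg_id, sequence;
```

Aggregates without any snapshot drop out of this result, because the comparison against a NULL last_snapshot_seq is never true.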

A window function can produce this for all (agg_type, agg_id) combinations with only one sort:

with mark as (
  select *, 
         bool_or(is_snapshot) over w as trail_true
    from account
  window w as (partition by agg_type, agg_id 
                   order by sequence
            rows between 1 following
                     and unbounded following)
)
select *
  from mark
 where not coalesce(trail_true, false)
 order by agg_type, agg_id, sequence
