Trouble performing Postgres group by non-ID column to get ID containing max value

I'm attempting to perform a GROUP BY on a join table. The join table essentially looks like:

CREATE TABLE user_foos (
    id SERIAL PRIMARY KEY,
    user_id INT NOT NULL,
    foo_id INT NOT NULL,
    effective_at TIMESTAMP NOT NULL
);
ALTER TABLE user_foos
    ADD CONSTRAINT user_foos_uniqueness
    UNIQUE (user_id, foo_id, effective_at);

I'd like to query this table to find all records where effective_at is the max value for each pair of user_id, foo_id. I've tried the following:

SELECT "user_foos"."id",
       "user_foos"."user_id",
       "user_foos"."foo_id",
       max("user_foos"."effective_at")
FROM "user_foos"
GROUP BY "user_foos"."user_id", "user_foos"."foo_id";

Unfortunately, this results in the error:

column "user_foos.id" must appear in the GROUP BY clause or be used in an aggregate function

I understand that the problem relates to "id" not being used in an aggregate function and that the DB doesn't know what to do if it finds multiple records with differing IDs, but I know this could never happen due to my three-column unique constraint across those columns (user_id, foo_id, and effective_at).

To work around this, I also tried a number of other variants, such as using the first_value window function on the id:

SELECT first_value("user_foos"."id"),
       "user_foos"."user_id",
       "user_foos"."foo_id",
       max("user_foos"."effective_at")
FROM "user_foos"
GROUP BY "user_foos"."user_id", "user_foos"."foo_id";

and:

SELECT first_value("user_foos"."id")
FROM "user_foos"
GROUP BY "user_foos"."user_id", "user_foos"."foo_id"
HAVING "user_foos"."effective_at" = max("user_foos"."effective_at")

Unfortunately, both of these result in a different error:

window function call requires an OVER clause

Ideally, my goal is to fetch ALL matching ids so that I can use them in a subquery to fetch the full row data from this table for the matching records. Can anyone provide insight on how I can get this working?

Postgres has a very nice feature called DISTINCT ON, which can be used in this case:

SELECT DISTINCT ON (uf."user_id", uf."foo_id") uf.*
FROM "user_foos" uf
ORDER BY uf."user_id", uf."foo_id", uf."effective_at" DESC;

It returns the first row of each group, where the groups are defined by the expressions in parentheses. The ORDER BY clause needs to start with those same expressions, followed by a further column that determines which row comes first in each group.
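Since the question asks for the matching ids to feed into a subquery, the same idea can be sketched as follows. This is only an illustration under the schema given in the question; the alias names uf and latest are assumptions introduced here, not part of the original answer.

SELECT uf.*
FROM user_foos uf
WHERE uf.id IN (
    -- One id per (user_id, foo_id): the row with the latest effective_at.
    SELECT DISTINCT ON (latest.user_id, latest.foo_id) latest.id
    FROM user_foos latest
    ORDER BY latest.user_id, latest.foo_id, latest.effective_at DESC
);

If the full rows are all that is needed, the plain DISTINCT ON query above already returns them and the outer query is unnecessary.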

Try:

SELECT *
FROM (
  SELECT t.*,
         row_number() OVER( partition by user_id, foo_id ORDER BY effective_at DESC ) x
  FROM user_foos t
) AS ranked  -- Postgres versions before 16 require an alias on a subquery in FROM
WHERE x = 1;

If you don't want to use a subquery based on a composite of all three keys, you can create a "dense rank" window function field (rank_order) that ranks the rows within each user_id, foo_id partition by effective date, descending. Then wrap that in a subquery and take the records where rank_order = 1. Since the ranking is by effective date, you get all fields of the record with the latest effective date for each foo and user.

DATASET
id  user_id  foo_id  effective_at
1   1        1       01/01/2001
2   1        1       01/01/2002
3   1        1       01/01/2003
4   1        2       01/01/2001
5   2        1       01/01/2001

DATASET WITH RANK ORDER PARTITIONED BY FOO_ID, USER_ID ORDERED BY DATE DESC
id  rank_order  user_id  foo_id  effective_at
1   3           1        1       01/01/2001
2   2           1        1       01/01/2002
3   1           1        1       01/01/2003
4   1           1        2       01/01/2001
5   1           2        1       01/01/2001

SELECT * FROM QUERY ABOVE WHERE RANK_ORDER=1
id  rank_order  user_id  foo_id  effective_at
3   1           1        1       01/01/2003
4   1           1        2       01/01/2001
5   1           2        1       01/01/2001
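
The steps described in this answer could be written roughly as below. This is a sketch rather than the answerer's own query: it assumes the user_foos table from the question, and the alias ranked is a name chosen here for illustration.

SELECT ranked.id, ranked.user_id, ranked.foo_id, ranked.effective_at
FROM (
    SELECT uf.*,
           -- Rank rows within each (user_id, foo_id) pair, newest first.
           dense_rank() OVER (
               PARTITION BY uf.user_id, uf.foo_id
               ORDER BY uf.effective_at DESC
           ) AS rank_order
    FROM user_foos uf
) AS ranked
WHERE ranked.rank_order = 1;

Because of the unique constraint on (user_id, foo_id, effective_at), there can be no ties within a partition, so dense_rank() and row_number() select the same rows here.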

