
Postgres select all columns but group by one column

I have a simple table with three columns: unit_id (oid), time (timestamp), and diag (bytea). The primary key is the combination of time and unit_id.

The idea behind this query is to get the latest row (largest timestamp) for each unique unit_id. However, the row with the latest time for each unit_id is not always returned.

I really want to group by just unit_id, but Postgres makes me group by diag as well, since I am selecting it.

SELECT DISTINCT ON(unit_id) max(time) as time, diag, unit_id 
FROM diagnostics.unit_diag_history  
GROUP BY unit_id, diag

Any time you start thinking that you want a localized GROUP BY, you should start thinking about window functions instead.

I think you're after something like this:

select unit_id, time, diag
from (
    select unit_id, time, diag,
           rank() over (partition by unit_id order by time desc) as rank
    from diagnostics.unit_diag_history
) as dt
where rank = 1

You might want to add something to the ORDER BY to break ties consistently as well, but that wouldn't alter the overall technique.
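
As a rough sketch of the tie-breaking idea: with the primary key on (time, unit_id) described in the question, two rows for the same unit_id cannot share a time, so rank() should never tie here. If the table lacked that guarantee, one option would be to swap rank() for row_number() and add a second sort key. The alias rn and the choice of diag as the tie-breaker are illustrative assumptions, not part of the original answer.

```sql
-- Sketch only: rn and the diag tie-breaker are illustrative choices.
select unit_id, time, diag
from (
    select unit_id, time, diag,
           -- row_number() numbers rows 1, 2, 3, ... within each unit_id,
           -- so exactly one row per unit_id gets rn = 1 even when times tie.
           row_number() over (
               partition by unit_id
               order by time desc, diag
           ) as rn
    from diagnostics.unit_diag_history
) as dt
where rn = 1;
```

Unlike rank(), which assigns the same value to tied rows and could therefore return several rows per unit_id, row_number() always assigns distinct numbers within a partition.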

You can join the grouped select with the original table:

SELECT d.time, d.diag, d.unit_id
FROM(
    SELECT unit_id, max(time) as max_time
    FROM diagnostics.unit_diag_history
    GROUP BY unit_id
) s JOIN diagnostics.unit_diag_history d
ON s.unit_id = d.unit_id AND s.max_time = d.time
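
Since the query in the question already reaches for DISTINCT ON, another option worth considering is to use it on its own, without GROUP BY or max(). This is a minimal sketch rather than something taken from the answers: it relies on PostgreSQL's DISTINCT ON keeping the first row of each group in ORDER BY order, and it assumes the same table and column names as the question.

```sql
-- DISTINCT ON (unit_id) keeps the first row of each unit_id group,
-- where "first" is decided by the ORDER BY clause. Sorting by time DESC
-- within each unit_id makes that first row the latest one.
-- The ORDER BY must start with the DISTINCT ON expression (unit_id).
SELECT DISTINCT ON (unit_id) unit_id, time, diag
FROM diagnostics.unit_diag_history
ORDER BY unit_id, time DESC;
```

The original query mixed DISTINCT ON with GROUP BY unit_id, diag, which produces one group per (unit_id, diag) pair and, with no ORDER BY, leaves it unspecified which of those groups DISTINCT ON keeps. That would explain why the latest row was not always returned.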
