Select first row in a group by, with several columns defining the group
Here is a dummy table that illustrates what I am trying to do:
ID_1    | ID_2     | ID_3       | Day | Energy_Costs
--------+----------+------------+-----+-------------
State_1 | County_1 | Building_1 | 1   | 48.8
State_1 | County_1 | Building_1 | 2   | 31.3
State_1 | County_1 | Building_2 | 1   | 20.5
State_1 | County_2 | Building_1 | 1   | 1.9
State_2 | County_1 | Building_1 | 1   | 6.6
State_2 | County_2 | Building_2 | 1   | 38.2
State_2 | County_2 | Building_2 | 2   | 12.0
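In the table above, three columns (ID_1, ID_2, ID_3) are needed to identify a unique record (a building, in this example). I would like to return a table that has the first row of a given day for a building.
This is what the query looks like in my head: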
SELECT FIRST(ID_1), FIRST(ID_2), FIRST(ID_3), FIRST(Energy_Costs), FIRST(DAY)
FROM buildings_db
GROUP BY ID_1, ID_2, ID_3
ORDER BY DAY
This would return:
ID_1    | ID_2     | ID_3       | Day | Energy_Costs
--------+----------+------------+-----+-------------
State_1 | County_1 | Building_1 | 1   | 48.8
State_1 | County_1 | Building_2 | 1   | 20.5
State_1 | County_2 | Building_1 | 1   | 1.9
State_2 | County_1 | Building_1 | 1   | 6.6
State_2 | County_2 | Building_2 | 1   | 38.2
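I have seen other questions asking something similar, but they usually do not have multiple columns defining a group. I am very new to SQL, so my attempts to translate them to my example have been unsuccessful. It would be very helpful if you could also explain why your solution works.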
You can use DISTINCT ON (), a PostgreSQL extension to standard SQL. It works with any number of columns defining a group:
SELECT DISTINCT ON (ID_1, ID_2, ID_3)
ID_1, ID_2, ID_3, DAY, Energy_Costs
FROM buildings_db
ORDER BY ID_1, ID_2, ID_3, DAY, Energy_Costs;
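This returns the first row for each distinct combination of (ID_1, ID_2, ID_3), where "first" is defined by the remaining ORDER BY expressions (here DAY, then Energy_Costs). The leading ORDER BY expressions must match the DISTINCT ON expressions.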
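To make the "first row per group" logic more explicit, the same result can be sketched with a window function instead of DISTINCT ON. This is a different technique from the one above, shown only as an illustration. The names `ranked` and `rn` are made up for this example.

```sql
-- Sketch: number the rows inside each (ID_1, ID_2, ID_3) group,
-- ordered by DAY and then Energy_Costs, and keep only row number 1.
-- "ranked" and "rn" are illustrative names, not part of the original post.
SELECT ID_1, ID_2, ID_3, DAY, Energy_Costs
FROM (
    SELECT ID_1, ID_2, ID_3, DAY, Energy_Costs,
           ROW_NUMBER() OVER (
               PARTITION BY ID_1, ID_2, ID_3   -- the columns that define a group
               ORDER BY DAY, Energy_Costs      -- what "first" means within a group
           ) AS rn
    FROM buildings_db
) ranked
WHERE rn = 1
ORDER BY ID_1, ID_2, ID_3;
```

PARTITION BY plays the role of the DISTINCT ON list, and the ORDER BY inside OVER plays the role of the trailing ORDER BY expressions. This form is more verbose, but it may help on databases that support window functions and lack DISTINCT ON.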
To get "the first row of a given day for a building":
SELECT DISTINCT ON (ID_1, ID_2, ID_3)
ID_1, ID_2, ID_3, DAY, Energy_Costs
FROM buildings_db
WHERE DAY = 1 -- given day
ORDER BY ID_1, ID_2, ID_3, Energy_Costs
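You can use a subquery and a JOIN for this. The subquery finds the earliest day for each (ID_1, ID_2, ID_3) group, and the join back to the table picks up the full row for that day: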
select b.ID_1, b.ID_2, b.ID_3, b.Energy_Costs, b.DAY
from buildings_db b
join
(
select ID_1, ID_2, ID_3, min(day) min_day
from buildings_db
group by ID_1, ID_2, ID_3
) t on b.id_1 = t.id_1 and
b.id_2 = t.id_2 and
b.id_3 = t.id_3 and
b.day = t.min_day
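To try the queries in this thread, the sample table could be recreated roughly as follows. The column types are assumptions, since the original post only shows the data and not the table definition.

```sql
-- Sketch of the sample data from the question.
-- Column types are assumed; adjust them to match the real table.
CREATE TABLE buildings_db (
    ID_1         text,
    ID_2         text,
    ID_3         text,
    DAY          integer,
    Energy_Costs numeric
);

INSERT INTO buildings_db (ID_1, ID_2, ID_3, DAY, Energy_Costs) VALUES
    ('State_1', 'County_1', 'Building_1', 1, 48.8),
    ('State_1', 'County_1', 'Building_1', 2, 31.3),
    ('State_1', 'County_1', 'Building_2', 1, 20.5),
    ('State_1', 'County_2', 'Building_1', 1, 1.9),
    ('State_2', 'County_1', 'Building_1', 1, 6.6),
    ('State_2', 'County_2', 'Building_2', 1, 38.2),
    ('State_2', 'County_2', 'Building_2', 2, 12.0);
```

With this data there are five distinct (ID_1, ID_2, ID_3) combinations, so a first-row-per-group query should return five rows, one per building. Note that the join on min(day) would return more than one row for a building if several of its rows shared the same earliest day; the sample data has no such ties.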