SQL: Update GROUP BY to include a value based on max value of another column
When a query uses GROUP BY with aggregate functions, how can I also return a specific value from another column?
Here is a sample of my table:
id | year | quarter | wage | comp_id | comp_industry |
123 | 2012 | 1 | 1000 | 456 | abc |
123 | 2012 | 1 | 2000 | 789 | def |
123 | 2012 | 2 | 1500 | 789 | def |
456 | 2012 | 1 | 2000 | 321 | ghi |
456 | 2012 | 2 | 2000 | 321 | ghi |
To get the sum of the wage values for each person by year and quarter, I ran the following query:
SELECT id, year, quarter, SUM(wage) AS sum_wage
FROM t1
GROUP BY id, year, quarter;
The result is:
id | year | quarter | sum_wage |
123 | 2012 | 1 | 3000 |
123 | 2012 | 2 | 1500 |
456 | 2012 | 1 | 2000 |
456 | 2012 | 2 | 2000 |
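I would like to update the query to include the comp_industry in which each person's wage was highest in each quarter and year. I am not sure where to start so that it returns only the industry in which each person earned the most in each quarter and year: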
id | year | quarter | sum_wage | comp_industry
123 | 2012 | 1 | 3000 | def
123 | 2012 | 2 | 1500 | def
456 | 2012 | 1 | 2000 | ghi
456 | 2012 | 2 | 2000 | ghi
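I have looked at questions about getting a value based on the max of a different column grouped by another column, and about fetching the row that has the max value for a column, but I am not sure where to go from there.
Any help or advice would be greatly appreciated!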
You can try using window functions with SUM and ROW_NUMBER.
Number the rows within each id, year, quarter partition, ordered by wage DESC, and then keep only the rows where rn = 1.
Schema (PostgreSQL v9.6)
CREATE TABLE T (
    id INT,
    year INT,
    quarter INT,
    wage INT,
    comp_id INT,
    comp_industry VARCHAR(50)
);
INSERT INTO T VALUES (123 , 2012 , 1 , 1000 , 456 ,'abc');
INSERT INTO T VALUES (123 , 2012 , 1 , 2000 , 789 ,'def');
INSERT INTO T VALUES (123 , 2012 , 2 , 1500 , 789 ,'def');
INSERT INTO T VALUES (456 , 2012 , 1 , 2000 , 321 ,'ghi');
INSERT INTO T VALUES (456 , 2012 , 2 , 2000 , 321 ,'ghi');
Query #1
SELECT id, year, quarter, sum_wage, comp_industry
FROM (
    SELECT *,
           SUM(wage) OVER (PARTITION BY id, year, quarter) sum_wage,
           ROW_NUMBER() OVER (PARTITION BY id, year, quarter ORDER BY wage DESC) rn
    FROM T
) t1
WHERE rn = 1;
| id | year | quarter | sum_wage | comp_industry |
| --- | ---- | ------- | -------- | ------------- |
| 123 | 2012 | 1 | 3000 | def |
| 123 | 2012 | 2 | 1500 | def |
| 456 | 2012 | 1 | 2000 | ghi |
| 456 | 2012 | 2 | 2000 | ghi |
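One thing to watch: the query above picks the industry of the single highest wage row. If a person can have several rows for the same industry in one quarter, you may want the industry with the highest total wage instead. The following is only a sketch under that assumption, using the same table T; the derived-table name per_industry and the tie-breaker on comp_industry are my own choices, not part of the question.

```sql
-- Sketch: aggregate per industry first, then rank the industries.
SELECT id, year, quarter, sum_wage, comp_industry
FROM (
    SELECT id, year, quarter, comp_industry,
           -- total across all industries for this person and quarter
           SUM(SUM(wage)) OVER (PARTITION BY id, year, quarter) AS sum_wage,
           -- rank industries by their own total, highest first;
           -- comp_industry is an assumed tie-breaker for equal totals
           ROW_NUMBER() OVER (PARTITION BY id, year, quarter
                              ORDER BY SUM(wage) DESC, comp_industry) AS rn
    FROM T
    GROUP BY id, year, quarter, comp_industry
) per_industry
WHERE rn = 1
ORDER BY id, year, quarter;
```

On the sample data each industry appears at most once per person and quarter, so this should return the same four rows as the query above. If you want ties to return every top industry rather than one, RANK() could be swapped in for ROW_NUMBER().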
I'm not 100% sure I understand the question, but does this work for you?
-- SUM(wage) here is the total earned in the top-ranked industry only
SELECT id,
       year,
       quarter,
       comp_industry,
       SUM(wage)
FROM (SELECT id,
             year,
             quarter,
             comp_industry,
             wage
      FROM (SELECT TMP.*,
                   -- rank industries by their total wage, highest first
                   RANK() OVER
                   ( PARTITION BY id,
                                  year,
                                  quarter
                     ORDER BY wage_sum DESC
                   ) wage_rnk
            FROM (SELECT t1.*,
                         -- total wage per person, quarter and industry
                         SUM(wage) OVER
                         ( PARTITION BY id,
                                        year,
                                        quarter,
                                        comp_industry
                         ) wage_sum
                  FROM t1
                 ) TMP
           ) TMP2
      WHERE wage_rnk = 1
     ) TMP3
GROUP BY id,
         year,
         quarter,
         comp_industry;