简体   繁体   中英

Apply aggregate function to all columns on table with group by

I am trying to select all the columns which will be the same based on the grouping

test_table
+------+-------+---------+----------+-----------+--------------+
| age  | name  |  score  |   col1   |   col2    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 20   | joe   |  10     |   DING   |   DONG    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 20   | joe   |  20     |   DING   |   DONG    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 22   | sue   |  25     |   SING   |   SONG    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 22   | sue   |  10     |   SING   |   SONG    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 50   | bob   |  25     |   RING   |   WRONG   | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 44   | joe   |  15     |   THING  |   THONG   | col3...col50 |
+------+-------+---------+----------+-----------+--------------+

The output that I am looking for would be:

+------+-------+---------+----------+-----------+--------------+
| age  | name  |sum(score|   col1   |   col2    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 20   | joe   |  30     |   DING   |   DONG    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 22   | sue   |  35     |   SING   |   SONG    | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 50   | bob   |  25     |   RING   |   WRONG   | col3...col50 |
+------+-------+---------+----------+-----------+--------------+
| 44   | joe   |  15     |   THING  |   THONG   | col3...col50 |
+------+-------+---------+----------+-----------+--------------+

I know this isn't right, but my the general thought process is:

select
   min(*),
   sum(score)
from test_table
group by age, name

I want to avoid doing something like:

select 
  min(col1),
  min(col2),
  ... cont ...,
  min(col50),
  sum(score)
from ...

You can't avoid listing all the columns individually. Also, if all the columns where you are using min have the same values for each combination of group by columns, then using min will be very inefficient - just list them in your select and group by clauses

You can use DISTINCT ON to get one row per group and join that with total scores calculated by a GROUP BY query. With this approach there will be score column containing value from some row in a group and a separate column for total score.

WITH total_scores AS (
    SELECT age, name, SUM(score) AS total_score
    FROM test_table
    GROUP BY age, name
)
SELECT DISTINCT ON (tt.age, tt.name)
    tt.*, ts.total_score
FROM test_table tt
JOIN total_scores ts ON tt.age = ts.age AND tt.name = ts.name

That said, it seems that you could normalize your data into two tables, one containing rows that have duplicate values (ie everything else except score ) and another table containing score and a foreign key pointing to the first table.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM