简体   繁体   中英

SQL group by: select value where another column has its min/max

I want to group by one column, get both min and max of a second column, and (this is the tricky part.) get the value from a third column where the second column has its min value in the group.

Example:

MyTable:

ID     TS     GRP
==================
 1     20      A
 2     20      B
 3     10      A
 4     30      A
 5     10      B
 6     40      A

Desired result (ID should be the value from the record where TS has its minimum):

ID    MIN_TS   MAX_TS   GRP
============================
 3      10       40      A
 5      10       20      B

In general, the grouping query is very easy:

SELECT <???> AS ID, MIN(TS) AS MIN_TS, MAX(TS) AS MAX_TS, GRP
FROM MyTable
GROUP BY GRP

But what about the ID part? It doesn't work this way with grouping, right? But why? And what's the best workaround?

Do the aggregation in a subquery, then look up the ID for each group in another subquery:

SELECT
  (SELECT TOP(1) id FROM MyTable WHERE grp = agg.grp ORDER BY ts DESC) AS id,
  min_ts, max_ts, grp
FROM (SELECT min(ts) AS min_ts, max(ts) AS max_ts, grp
      FROM MyTable
      GROUP BY grp) agg

Or use window functions:

SELECT id, min_ts, max_ts, grp
FROM (SELECT 
        id,
        min(ts) OVER (PARTITION BY grp) min_ts,
        max(ts) OVER (PARTITION BY grp) max_ts,
        grp,
        row_number OVER (PARTITION BY grp ORDER BY ts) rn
      FROM MyTable)
WHERE rn = 1;

This query uses window functions to calculate min_ts and max_ts for each group, and then filters to only include the first row for each group (ordered by ts ).

A bit late, but for future comers...

I can propose another solution that's a bit different than the existing one, it's basically the same idea, but it's implemented in another way (and it's maybe a bit faster?).

So you can basically make all the grouping and aggregation in a sub query (using WITH ), then use INNER JOIN between that query and your original table to get what you want, it would be something like this...

WITH values AS (
  SELECT
    MIN(ts) as min_ts,
    MAX(ts) AS max_ts,
    grp
  FROM MyTable
  GROUP BY grp
)

SELECT
  tb.id AS id,
  v.min_ts AS min_ts,
  v.max_ts AS max_ts,
  tb.grp AS grp
FROM MyTable tb
INNER JOIN values v ON v.grp = tb.grp AND v.min_ts = tb.ts;

This is a bit simpler, and a bit more intuitive (at least to me).

I have tested this on Postgres DB, and for the following data

 id | ts | grp 
----+----+-----
  7 |  5 | A
  3 | 10 | A
  1 | 20 | A
  5 | 30 | A
  4 | 10 | B
  2 | 20 | B
  6 | 30 | B
  8 | 60 | B

It gives the following results

 id | min_ts | max_ts | grp 
----+--------+--------+-----
  7 |      5 |     30 | A
  4 |     10 |     60 | B

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM