GROUP BY to return entire row

Question

I want to return the most recent ten (complete) rows from a table, but ensuring they all have unique 'alternate_id's. If there are two in the top ten with the same alternate_id I don't care which I take, but I want one of them: I thought I'd use group by as follows:

select * 
from track_history_items 
where type='FooType' 
group by alternate_id order by created_at desc limit 10;

but this seems to be causing problems (failing to return rows with alternate_ids that are definitely in the top 10). Any ideas how I should do this properly?

* SOLUTION * (I can't post an answer as I'm a new user)

here's what I've ended up doing:

SELECT field1,
...,
max(created_at),
...,
fieldN
FROM track_history_items
where type='FooType'
group by alternate_id
order by created_at desc
limit 10

This seems to do the trick. Thought I'd post it here in case it's of use to others, or there are any mistakes with it that someone might spot!

Answer 1

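In standard SQL, a column in the select list of a GROUP BY query generally has to be either a grouping column or wrapped in an aggregate function (like sum, avg, ...), so SELECT * combined with GROUP BY alternate_id is not portable.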

You can use:

SELECT
  DISTINCT alternate_id,
  field1, -- one value per alternate_id
  ...,    -- one value per alternate_id
  fieldn  -- one value per alternate_id
FROM
  track_history_items
WHERE
  type = 'FooType'
ORDER BY
  created_at DESC
LIMIT 10

This is standard SQL.

This does not guarantee unique values in the alternate_id column: you will get every distinct combination of {alternate_id, field1, ..., fieldn}.
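To make that caveat concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and sample rows are assumptions made up for illustration; only the names track_history_items, alternate_id, type and created_at come from the question.

```python
import sqlite3

# Hypothetical miniature of track_history_items (schema and rows are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE track_history_items ("
    "id INTEGER PRIMARY KEY, alternate_id INTEGER, type TEXT, "
    "field1 TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO track_history_items (alternate_id, type, field1, created_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "FooType", "a", "2011-04-01"),
        (1, "FooType", "b", "2011-04-02"),  # same alternate_id, different field1
        (2, "FooType", "c", "2011-04-03"),
    ],
)

distinct_rows = conn.execute(
    "SELECT DISTINCT alternate_id, field1 "
    "FROM track_history_items WHERE type = 'FooType' "
    "ORDER BY alternate_id, field1"
).fetchall()

# DISTINCT compares whole result rows, so alternate_id 1 appears twice.
print(distinct_rows)  # [(1, 'a'), (1, 'b'), (2, 'c')]
```

So SELECT DISTINCT only collapses rows that are identical in every selected column; it does not pick one row per alternate_id.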

Answer 2

In SQLite, if you do SELECT max(anything) ... GROUP BY alternate_id, a documented SQLite extension (bare columns in an aggregate query) guarantees that the other selected columns come from the row holding the maximum, as mentioned at "SQLite - SELECT DISTINCT of one column and get the others", which is why your query works on SQLite. The order by created_at desc is not needed to pick the row within each group (keep it only if the LIMIT must return the most recent groups):

SELECT max(created_at), field1, fieldN
FROM track_history_items
where type='FooType'
group by alternate_id
limit 10

It would however fail on PostgreSQL.
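The SQLite behaviour described above can be checked with Python's built-in sqlite3 module. This is only a sketch: the schema, the field1 column and the sample rows are assumptions for illustration. It keeps an ORDER BY on the aggregated timestamp so that LIMIT returns the most recent groups first.

```python
import sqlite3

# Hypothetical miniature of track_history_items (schema and rows are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE track_history_items ("
    "id INTEGER PRIMARY KEY, alternate_id INTEGER, type TEXT, "
    "field1 TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO track_history_items (alternate_id, type, field1, created_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "FooType", "new-1", "2011-04-03 10:00:00"),
        (1, "FooType", "old-1", "2011-04-01 10:00:00"),
        (2, "FooType", "only-2", "2011-04-02 10:00:00"),
        (3, "BarType", "other", "2011-04-04 10:00:00"),  # filtered out by type
    ],
)

# With max(created_at) in the select list, SQLite takes the bare column
# field1 from the same row that supplied the maximum.
latest = conn.execute(
    "SELECT alternate_id, field1, max(created_at) AS latest_at "
    "FROM track_history_items WHERE type = 'FooType' "
    "GROUP BY alternate_id "
    "ORDER BY latest_at DESC LIMIT 10"
).fetchall()

print(latest)
# [(1, 'new-1', '2011-04-03 10:00:00'), (2, 'only-2', '2011-04-02 10:00:00')]
```

This relies on SQLite-specific behaviour, so the same query should not be expected to run unchanged on other databases.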

On PostgreSQL you could use SELECT DISTINCT ON (alternate_id). With plain SELECT DISTINCT (without ON, as mentioned at https://stackoverflow.com/a/5803116/895245), you get unique full tuples, which could repeat alternate_id. The PostgreSQL ON extension instead lets you separately select "what you want to get back" and "what you want to be unique":

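SELECT *
FROM (
  -- PostgreSQL requires the DISTINCT ON expression to lead the ORDER BY;
  -- created_at DESC then keeps the newest row of each alternate_id
  SELECT DISTINCT ON (alternate_id)
    "field1",
    "fieldN",
    created_at
  FROM track_history_items
  WHERE type='FooType'
  ORDER BY alternate_id, created_at DESC
) sub
ORDER BY created_at DESC
LIMIT 10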

See also: What is the difference between Postgres DISTINCT vs DISTINCT ON?

Finally, another option is to use the ROW_NUMBER window function, which works on both PostgreSQL and SQLite (tested on PostgreSQL 14.3 and SQLite 3.34.0), as in:

SELECT *
FROM (
    SELECT
      ROW_NUMBER() OVER (
        PARTITION BY "alternate_id"
        ORDER BY "created_at" DESC
      ) AS "rnk",
      *
    FROM "track_history_items"
    WHERE "type" = 'FooType'
  ) sub
WHERE
  "sub"."rnk" = 1
ORDER BY
  "sub"."created_at" DESC
LIMIT 10

As mentioned at: https://stackoverflow.com/a/71924314/895245 that form allows for great versatility.
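As a quick way to try this, here is a sketch that runs the ROW_NUMBER form with the question's type filter and created_at ordering, using Python's built-in sqlite3 module. The schema, the field1 column and the sample rows are assumptions for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical miniature of track_history_items (schema and rows are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE track_history_items ("
    "id INTEGER PRIMARY KEY, alternate_id INTEGER, type TEXT, "
    "field1 TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO track_history_items (alternate_id, type, field1, created_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "FooType", "new-1", "2011-04-03 10:00:00"),
        (1, "FooType", "old-1", "2011-04-01 10:00:00"),
        (2, "FooType", "only-2", "2011-04-02 10:00:00"),
        (3, "BarType", "other", "2011-04-04 10:00:00"),  # filtered out by type
    ],
)

query = """
SELECT alternate_id, field1, created_at
FROM (
    SELECT
      ROW_NUMBER() OVER (
        PARTITION BY alternate_id
        ORDER BY created_at DESC
      ) AS rnk,
      *
    FROM track_history_items
    WHERE type = 'FooType'
  ) sub
WHERE sub.rnk = 1          -- newest row of each alternate_id
ORDER BY sub.created_at DESC
LIMIT 10
"""
newest_per_id = conn.execute(query).fetchall()

print(newest_per_id)
# [(1, 'new-1', '2011-04-03 10:00:00'), (2, 'only-2', '2011-04-02 10:00:00')]
```

Unlike the max() trick, this form does not depend on SQLite-specific handling of bare columns, which is why it ports between the two databases.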

2 answers

Answer 1: 0, ACCEPTED, 2011-04-27 11:13:02
Answer 2: 0, 2022-07-15 16:35:55
