简体   繁体   中英

POSTGRESQL : How to select the first row of each group?

With this query :

WITH responsesNew AS
(
  SELECT DISTINCT responses."studentId", notation, responses."givenHeart", 
  SUM(notation + responses."givenHeart") OVER (partition BY responses."studentId" 
  ORDER BY responses."createdAt") AS total, responses."createdAt",  
  FROM responses
)
SELECT responsesNew."studentId", notation, responsesNew."givenHeart", total, 
responsesNew."createdAt"
FROM responsesNew
WHERE total = 3
GROUP BY responsesNew."studentId", notation, responsesNew."givenHeart", total, 
responsesNew."createdAt"
ORDER BY responsesNew."studentId" ASC

I get this data table :

studentId | notation | givenHeart | total |      createdAt     |
----------+----------+------------+-------+--------------------+
 374      | 1        | 0          | 3     | 2017-02-13 12:43:03   
 374      | null     | 0          | 3     | 2017-02-15 22:22:17
 639      | 1        | 2          | 3     | 2017-04-03 17:21:30 
 790      | 1        | 0          | 3     | 2017-02-12 21:12:23
 ...

My goal is to keep only in my data table the early row of each group like shown below :

studentId | notation | givenHeart | total |      createdAt     |
----------+----------+------------+-------+--------------------+
 374      | 1        | 0          | 3     | 2017-02-13 12:43:03 
 639      | 1        | 2          | 3     | 2017-04-03 17:21:30 
 790      | 1        | 0          | 3     | 2017-02-12 21:12:23
 ...

How can I get there?

I've read many topics over here but nothing I've tried with DISTINCT , DISTINCT ON , subqueries in WHERE , LIMIT , etc have worked for me (surely due to my poor understanding). I've met errors related to window function, missing column in ORDER BY and a few others I can't remember.

You can do this with distinct on . The query would look like this:

WITH responsesNew AS (
      SELECT DISTINCT r."studentId", notation, r."givenHeart", 
             SUM(notation + r."givenHeart") OVER (partition BY r."studentId" 
                                                  ORDER BY r."createdAt") AS total,
             r."createdAt" 
      FROM responses r
     )
SELECT DISTINCT ON (r."studentId") r."studentId", notation, r."givenHeart", total, 
r."createdAt"
FROM responsesNew r
WHERE total = 3
ORDER BY r."studentId" ASC, r."createdAt";

I'm pretty sure this can be simplified. I just don't understand the purpose of the CTE. Using SELECT DISTINCT in this way is very curious.

If you want a simplified query, ask another question with sample data, desired results, and explanation of what you are doing and include the query or a link to this question.

use Row_number() window function to add a row number to each partition and then only show row 1.

no need to fully qualify names if only one table is involved. and use aliases when qualifying to simplify readability.

WITH responsesNew AS
(
  SELECT "studentId"
       , notation
       , "givenHeart"
       , SUM(notation + "givenHeart") OVER (partition BY "studentId" ORDER BY "createdAt") AS total
       , "createdAt"
       , Row_number() OVER ("studentId" ORDER BY "createdAt") As RNum
  FROM responses r
)
SELECT RN."studentId"
     , notation, RN."givenHeart"
     , total
     , RN."createdAt"
FROM responsesNew RN
WHERE total = 3
  AND RNum = 1
GROUP BY RN."studentId"
       , notation
       , RN."givenHeart", total
       , RN."createdAt"
ORDER BY RN."studentId" ASC

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM