Select first row in each GROUP BY group?

Question

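As the title suggests, I'd like to select the first row of each set of rows grouped with a GROUP BY.

Specifically, if I've got a purchases table that looks like this:

SELECT * FROM purchases;

My output:

id	customer	total
1	Joe	5
2	Sally	3
3	Joe	2
4	Sally	1

I'd like to query for the id of the largest purchase (total) made by each customer. Something like this: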

SELECT FIRST(id), customer, FIRST(total)
FROM  purchases
GROUP BY customer
ORDER BY total DESC;

Expected output:

FIRST(id)	customer	FIRST(total)
1	Joe	5
2	Sally	3

Answer 1

In PostgreSQL, DISTINCT ON is typically the simplest and fastest.
(For performance optimization for certain workloads, see below.)

SELECT DISTINCT ON (customer)
       id, customer, total
FROM   purchases
ORDER  BY customer, total DESC, id;

Or shorter (if not as clear) with ordinal numbers of output columns:

SELECT DISTINCT ON (2)
       id, customer, total
FROM   purchases
ORDER  BY 2, 3 DESC, 1;

If total can be NULL, add NULLS LAST:

...
ORDER  BY customer, total DESC NULLS LAST, id;

Works either way, but you'll want to match existing indexes.


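Major points

DISTINCT ON is a PostgreSQL extension of the SQL standard, which only defines DISTINCT on the whole SELECT list.

List any number of expressions in the DISTINCT ON clause; the combined row value defines duplicates. The manual:

Obviously, two rows are considered distinct if they differ in at least one column value. Null values are considered equal in this comparison.

DISTINCT ON can be combined with ORDER BY. Leading expressions in ORDER BY must be in the set of expressions in DISTINCT ON, but you can rearrange the order among those freely.
You can add additional expressions to ORDER BY to pick a particular row from each group of peers. Or, as the manual puts it:

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.

I added id as the last item to break ties:
"Pick the row with the smallest id from each group sharing the highest total."

To order results in a way that disagrees with the sort order determining the first row per group, you can nest the above query in an outer query with another ORDER BY.

If total can be NULL, you most probably want the row with the greatest non-null value. Add NULLS LAST as demonstrated. See:

Sort by column ASC, but NULL values first?

The SELECT list is not constrained by expressions in DISTINCT ON or ORDER BY in any way:

You don't have to include any of the expressions in DISTINCT ON or ORDER BY.
You can include any other expression in the SELECT list. This is instrumental for replacing complex subqueries and aggregate / window functions.

I tested with Postgres versions 8.3 - 15. But the feature has been there at least since version 7.1, so basically always.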
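To make the semantics above concrete, here is a minimal Python sketch that models what DISTINCT ON (customer) with ORDER BY customer, total DESC, id selects. It is only an illustration of the "sort, then keep the first row of each group" idea, not how Postgres executes the query. The function name distinct_on_customer is made up for this example, and it assumes total is never NULL.

```python
from itertools import groupby

# Sample rows from the question: (id, customer, total)
purchases = [(1, "Joe", 5), (2, "Sally", 3), (3, "Joe", 2), (4, "Sally", 1)]


def distinct_on_customer(rows):
    """Model of: SELECT DISTINCT ON (customer) ... ORDER BY customer, total DESC, id.

    Hypothetical helper for illustration; assumes total is never NULL.
    """
    # ORDER BY customer, total DESC, id
    ordered = sorted(rows, key=lambda r: (r[1], -r[2], r[0]))
    # DISTINCT ON (customer): keep the first row of each run of equal customers
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[1])]


result = distinct_on_customer(purchases)
print(result)
```

Because id is the last sort key, two rows that tie on total are resolved in favor of the smaller id, which is the tie-break described above.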

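Index

The perfect index for the above query would be a multi-column index spanning all three columns in matching sequence and with matching sort order: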

CREATE INDEX purchases_3c_idx ON purchases (customer, total DESC, id);

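May be too specialized. But use it if read performance for the particular query is crucial. If you have DESC NULLS LAST in the query, use the same in the index so that the sort order matches and the index is fully applicable.

Effectiveness / Performance optimization

Weigh cost and benefit before creating tailored indexes for each query. The potential of the above index largely depends on data distribution.

The index is used because it delivers pre-sorted data. In Postgres 9.2 or later the query can also benefit from an index-only scan if the index is smaller than the underlying table. The index has to be scanned in its entirety, though.

For few rows per customer (high cardinality in column customer), this is very efficient. Even more so if you need sorted output anyway. The benefit shrinks with a growing number of rows per customer.
Ideally, you have enough work_mem to process the involved sort step in RAM and not spill to disk. But generally setting work_mem too high can have adverse effects. Consider SET LOCAL for exceptionally big queries. Find out how much you need with EXPLAIN ANALYZE. Mention of "Disk:" in the sort step indicates the need for more.

For many rows per customer (low cardinality in column customer), a loose index scan (a.k.a. "skip scan") would be (much) more efficient, but that is not implemented up to Postgres 14. (An implementation for index-only scans is in development for Postgres 15.)
For now, there are faster query techniques to substitute for this, in particular if you have a separate table holding unique customers, which is the typical use case. But also if you don't.

Benchmarks

See the separate answer.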

Answer 2

On databases that support CTEs and window functions:

WITH summary AS (
    SELECT p.id, 
           p.customer, 
           p.total, 
           ROW_NUMBER() OVER(PARTITION BY p.customer 
                                 ORDER BY p.total DESC) AS rank
      FROM PURCHASES p)
 SELECT *
   FROM summary
 WHERE rank = 1

Supported by any database:

But you need to add logic to break ties:

  SELECT MIN(x.id),  -- change to MAX if you want the highest
         x.customer, 
         x.total
    FROM PURCHASES x
    JOIN (SELECT p.customer,
                 MAX(total) AS max_total
            FROM PURCHASES p
        GROUP BY p.customer) y ON y.customer = x.customer
                              AND y.max_total = x.total
GROUP BY x.customer, x.total

Answer 3

Benchmarks

I tested the most interesting candidates:

Initially with Postgres 9.4 and 9.5.
Added accented tests for Postgres 13 later.

Basic test setup

Main table: purchases:

CREATE TABLE purchases (
  id          serial  -- PK constraint added below
, customer_id int     -- REFERENCES customer
, total       int     -- could be amount of money in Cent
, some_column text    -- to make the row bigger, more realistic
);

Dummy data (with some dead tuples), PK, index:

INSERT INTO purchases (customer_id, total, some_column)    -- 200k rows
SELECT (random() * 10000)::int             AS customer_id  -- 10k distinct customers
     , (random() * random() * 100000)::int AS total     
     , 'note: ' || repeat('x', (random()^2 * random() * random() * 500)::int)
FROM   generate_series(1,200000) g;

ALTER TABLE purchases ADD CONSTRAINT purchases_id_pkey PRIMARY KEY (id);

DELETE FROM purchases WHERE random() > 0.9;  -- some dead rows

INSERT INTO purchases (customer_id, total, some_column)
SELECT (random() * 10000)::int             AS customer_id  -- 10k customers
     , (random() * random() * 100000)::int AS total     
     , 'note: ' || repeat('x', (random()^2 * random() * random() * 500)::int)
FROM   generate_series(1,20000) g;  -- add 20k to make it ~ 200k

CREATE INDEX purchases_3c_idx ON purchases (customer_id, total DESC, id);

VACUUM ANALYZE purchases;

customer table - used for the optimized query:

CREATE TABLE customer AS
SELECT customer_id, 'customer_' || customer_id AS customer
FROM   purchases
GROUP  BY 1
ORDER  BY 1;

ALTER TABLE customer ADD CONSTRAINT customer_customer_id_pkey PRIMARY KEY (customer_id);

VACUUM ANALYZE customer;

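In my second test for 9.5 I used the same setup, but with 100000 distinct customer_id to get few rows per customer_id.

Object sizes for table `purchases`

Basic setup: 200k rows in purchases, 10k distinct customer_id, avg. 20 rows per customer.
For Postgres 9.5 I added a second test with 86446 distinct customers - avg. 2.3 rows per customer.

Generated with a query taken from here:

Measure the size of a PostgreSQL table row

Gathered for Postgres 9.5: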

               what                | bytes/ct | bytes_pretty | bytes_per_row
-----------------------------------+----------+--------------+---------------
 core_relation_size                | 20496384 | 20 MB        |           102
 visibility_map                    |        0 | 0 bytes      |             0
 free_space_map                    |    24576 | 24 kB        |             0
 table_size_incl_toast             | 20529152 | 20 MB        |           102
 indexes_size                      | 10977280 | 10 MB        |            54
 total_size_incl_toast_and_indexes | 31506432 | 30 MB        |           157
 live_rows_in_text_representation  | 13729802 | 13 MB        |            68
 ------------------------------    |          |              |
 row_count                         |   200045 |              |
 live_tuples                       |   200045 |              |
 dead_tuples                       |    19955 |              |

Queries

1. `row_number()` in CTE (see other answer)

WITH cte AS (
   SELECT id, customer_id, total
        , row_number() OVER (PARTITION BY customer_id ORDER BY total DESC) AS rn
   FROM   purchases
   )
SELECT id, customer_id, total
FROM   cte
WHERE  rn = 1;

2. `row_number()` in subquery (my optimization)

SELECT id, customer_id, total
FROM   (
   SELECT id, customer_id, total
        , row_number() OVER (PARTITION BY customer_id ORDER BY total DESC) AS rn
   FROM   purchases
   ) sub
WHERE  rn = 1;

3. `DISTINCT ON` (see other answer)

SELECT DISTINCT ON (customer_id)
       id, customer_id, total
FROM   purchases
ORDER  BY customer_id, total DESC, id;

4. rCTE with `LATERAL` subquery (see here)

WITH RECURSIVE cte AS (
   (  -- parentheses required
   SELECT id, customer_id, total
   FROM   purchases
   ORDER  BY customer_id, total DESC
   LIMIT  1
   )
   UNION ALL
   SELECT u.*
   FROM   cte c
   ,      LATERAL (
      SELECT id, customer_id, total
      FROM   purchases
      WHERE  customer_id > c.customer_id  -- lateral reference
      ORDER  BY customer_id, total DESC
      LIMIT  1
      ) u
   )
SELECT id, customer_id, total
FROM   cte
ORDER  BY customer_id;
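The recursive CTE in query 4 emulates an index skip scan: take the first index entry, then jump straight to the first entry of the next customer_id, and repeat. The following Python sketch models that idea with a sorted list standing in for the index on (customer_id, total DESC, id). The data and the function name first_per_customer are hypothetical. Unlike query 4, the sketch also breaks ties on id, because id is part of the modeled index.

```python
from bisect import bisect_right

# Hypothetical sample rows: (id, customer_id, total)
rows = [(1, 10, 50), (2, 10, 20), (3, 11, 70), (4, 11, 70), (5, 12, 5)]

# Stand-in for the index (customer_id, total DESC, id):
# a list sorted by (customer_id, -total, id).
index = sorted((cust, -total, id_) for id_, cust, total in rows)


def first_per_customer(index):
    """Yield (id, customer_id, total) of the top row per customer."""
    pos = 0
    while pos < len(index):
        cust, neg_total, id_ = index[pos]
        yield (id_, cust, -neg_total)
        # Skip all remaining entries of this customer, like
        # WHERE customer_id > c.customer_id ... LIMIT 1 in the rCTE.
        pos = bisect_right(index, (cust, float("inf"), float("inf")))


result = list(first_per_customer(index))
print(result)
```

Each step costs one binary search instead of reading every row of the customer, which is why this approach pays off with many rows per customer and loses with few.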

5. `customer` table with `LATERAL` (see here)

SELECT l.*
FROM   customer c
,      LATERAL (
   SELECT id, customer_id, total
   FROM   purchases
   WHERE  customer_id = c.customer_id  -- lateral reference
   ORDER  BY total DESC
   LIMIT  1
   ) l;

6. `array_agg()` with `ORDER BY` (see other answer)

SELECT (array_agg(id ORDER BY total DESC))[1] AS id
     , customer_id
     , max(total) AS total
FROM   purchases
GROUP  BY customer_id;

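Results

Execution time for the above queries with EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF), best of 5 runs to compare with warm cache.

All queries used an Index Only Scan on purchases2_3c_idx (among other steps). Some only to benefit from the smaller size of the index, others more effectively.

A. Postgres 9.4 with 200k rows and ~ 20 per `customer_id`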

1. 273.274 ms  
2. 194.572 ms  
3. 111.067 ms  
4.  92.922 ms  -- !
5.  37.679 ms  -- winner
6. 189.495 ms

B. Same as A. with Postgres 9.5

1. 288.006 ms
2. 223.032 ms  
3. 107.074 ms  
4.  78.032 ms  -- !
5.  33.944 ms  -- winner
6. 211.540 ms

C. Same as B., but with ~ 2.3 rows per `customer_id`

1. 381.573 ms
2. 311.976 ms
3. 124.074 ms  -- winner
4. 710.631 ms
5. 311.976 ms
6. 421.679 ms

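Retest with Postgres 13 on 2021-08-11

Simplified test setup: no deleted rows, because VACUUM ANALYZE cleans the table completely for the simple case.

Important changes for Postgres:

General performance improvements.
CTEs can be inlined since Postgres 12, so queries 1. and 2. now perform mostly identically (same query plan).

D. Like B. ~ 20 rows per customer_id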

1. 103 ms
2. 103 ms  
3.  23 ms  -- winner  
4.  71 ms  
5.  22 ms  -- winner
6.  81 ms

E. Like C. ~ 2.3 rows per customer_id

1. 127 ms
2. 126 ms  
3.  36 ms  -- winner  
4. 620 ms  
5. 145 ms
6. 203 ms

Accented tests with Postgres 13

1M rows, 10,000 vs. 100 vs. 1.6 rows per customer.

F. with ~ 10,000 rows per customer

1. 526 ms
2. 527 ms  
3. 127 ms
4.   2 ms  -- winner !
5.   1 ms  -- winner !
6. 356 ms

G. with ~ 100 rows per customer

1. 535 ms
2. 529 ms  
3. 132 ms
4. 108 ms  -- !
5.  71 ms  -- winner
6. 376 ms

H. with ~ 1.6 rows per customer

1.  691 ms
2.  684 ms  
3.  234 ms  -- winner
4. 4669 ms
5. 1089 ms
6. 1264 ms

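Conclusions

DISTINCT ON uses the index effectively and typically performs best for few rows per group. And it performs decently even with many rows per group.
For many rows per group, emulating an index skip scan with an rCTE performs best - second only to the query technique with a separate lookup table (if that's available).
The row_number() technique demonstrated in the currently accepted answer never wins any performance test. Not then, not now. It never comes even close to DISTINCT ON, not even when the data distribution is unfavorable for the latter. The only good thing about row_number(): it does not scale terribly, just mediocre.

More benchmarks

A benchmark on Postgres 11.5 with 10M rows and 60k unique "customers". Results are in line with what we have seen so far:

Proper way to access latest row for each individual identifier?

Original (outdated) benchmark from 2011

I ran three tests with PostgreSQL 9.1 on a real-life table of 65579 rows with single-column btree indexes on each of the three columns involved, and took the best execution time of 5 runs.
Comparing @OMGPonies' first query (A) to the above DISTINCT ON solution (B):

1. Select the whole table, which results in 5958 rows in this case.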

A: 567.218 ms
B: 386.673 ms

2. Use the condition WHERE customer BETWEEN x AND y, resulting in 1000 rows.

A: 249.136 ms
B:  55.111 ms

3. Select a single customer with WHERE customer = x.

A:   0.143 ms
B:   0.072 ms

Same tests repeated with the index described in the other answer:

CREATE INDEX purchases_3c_idx ON purchases (customer, total DESC, id);

1A: 277.953 ms  
1B: 193.547 ms

2A: 249.796 ms -- special index not used  
2B:  28.679 ms

3A:   0.120 ms  
3B:   0.048 ms

Answer 4

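This is the common greatest-n-per-group problem, which already has well-tested and highly optimized solutions. Personally I prefer the left join solution by Bill Karwin (the original post has lots of other solutions).

Mind you that a bunch of solutions to this common problem can surprisingly be found in one of the most official sources, the MySQL manual! See Examples of Common Queries :: The Rows Holding the Group-wise Maximum of a Certain Column.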

Answer 5

In Postgres you can use array_agg like this:

SELECT  customer,
        (array_agg(id ORDER BY total DESC))[1],
        max(total)
FROM purchases
GROUP BY customer

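This will give you the id of each customer's largest purchase.

Some things to note:

array_agg is an aggregate function, so it works with GROUP BY.
array_agg lets you specify an ordering scoped to just itself, so it doesn't constrain the structure of the whole query. There is also syntax for how NULLs are sorted, if you need something different from the default.
Once we build the array, we take the first element. (Postgres arrays are 1-indexed, not 0-indexed.)
You could use array_agg in a similar way for your third output column, but max(total) is simpler.
Unlike DISTINCT ON, using array_agg lets you keep your GROUP BY, in case you want that for other reasons.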

Answer 6

As pointed out by Erwin, this solution is not very efficient because of the subquery:

select * from purchases p1 where total in
(select max(total) from purchases where p1.customer=customer) order by total desc;

Answer 7

The query:

SELECT purchases.*
FROM purchases
LEFT JOIN purchases as p 
ON 
  p.customer = purchases.customer 
  AND 
  purchases.total < p.total
WHERE p.total IS NULL

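HOW DOES THAT WORK! (I've been there)

We want to make sure that we only have the highest total for each customer.

Some theoretical stuff (skip this part if you only want to understand the query)

Let Total be a function T(customer, id) that returns a value given the name and id. To prove that the given total (T(customer, id)) is the highest, we have to prove either

∀x T(customer, id) > T(customer, x) (this total is higher than all other totals for that customer)

OR

¬∃x T(customer, id) < T(customer, x) (there exists no higher total for that customer)

The first approach requires us to get all the records for that name, which I do not really like.

The second one needs a smart way to say there can be no record higher than this one.

Back to SQL

If we left join the table to itself on the name and on the total being less than that of the joined table: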

LEFT JOIN purchases as p 
ON 
p.customer = purchases.customer 
AND 
purchases.total < p.total

we make sure that every record that has another record with a higher total for the same user gets joined:

+--------------+---------------------+-----------------+------+------------+---------+
| purchases.id |  purchases.customer | purchases.total | p.id | p.customer | p.total |
+--------------+---------------------+-----------------+------+------------+---------+
|            1 | Tom                 |             200 |    2 | Tom        |     300 |
|            2 | Tom                 |             300 |      |            |         |
|            3 | Bob                 |             400 |    4 | Bob        |     500 |
|            4 | Bob                 |             500 |      |            |         |
|            5 | Alice               |             600 |    6 | Alice      |     700 |
|            6 | Alice               |             700 |      |            |         |
+--------------+---------------------+-----------------+------+------------+---------+

That will help us filter for the highest total for each customer with no grouping needed:

WHERE p.total IS NULL
    
+--------------+--------------------+-----------------+------+------------+---------+
| purchases.id | purchases.customer | purchases.total | p.id | p.customer | p.total |
+--------------+--------------------+-----------------+------+------------+---------+
|            2 | Tom                |             300 |      |            |         |
|            4 | Bob                |             500 |      |            |         |
|            6 | Alice              |             700 |      |            |         |
+--------------+--------------------+-----------------+------+------------+---------+

And that's the answer we need.
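For anyone who wants to try the anti-join without a database server, here is a minimal sketch using Python's built-in sqlite3 module with the Tom/Bob/Alice rows from the tables above. SQLite and the ORDER BY are my additions for the illustration. The second run adds a hypothetical tied row to show that this query returns every row sharing the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, "Tom", 200), (2, "Tom", 300), (3, "Bob", 400),
     (4, "Bob", 500), (5, "Alice", 600), (6, "Alice", 700)],
)

query = """
SELECT purchases.*
FROM   purchases
LEFT   JOIN purchases AS p
       ON p.customer = purchases.customer
      AND purchases.total < p.total
WHERE  p.total IS NULL
ORDER  BY purchases.id   -- added only to make the output order deterministic
"""
rows = conn.execute(query).fetchall()
print(rows)

# Hypothetical extra row that ties with id 2 on the highest total for Tom.
conn.execute("INSERT INTO purchases VALUES (7, 'Tom', 300)")
rows_with_tie = conn.execute(query).fetchall()
print(rows_with_tie)
```

So if you need exactly one row per customer, the join condition needs an extra tie-breaker (for example on id).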

Answer 8

I use this way (postgresql only): https://wiki.postgresql.org/wiki/First/last_%28aggregate%29

-- Create a function that always returns the first non-NULL item
CREATE OR REPLACE FUNCTION public.first_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$
        SELECT $1;
$$;

-- And then wrap an aggregate around it
CREATE AGGREGATE public.first (
        sfunc    = public.first_agg,
        basetype = anyelement,
        stype    = anyelement
);

-- Create a function that always returns the last non-NULL item
CREATE OR REPLACE FUNCTION public.last_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$
        SELECT $2;
$$;

-- And then wrap an aggregate around it
CREATE AGGREGATE public.last (
        sfunc    = public.last_agg,
        basetype = anyelement,
        stype    = anyelement
);

Then your example should work almost as is:

SELECT FIRST(id), customer, FIRST(total)
FROM  purchases
GROUP BY customer
ORDER BY FIRST(total) DESC;

CAVEAT: It ignores NULL rows
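The reason NULLs are ignored is the STRICT declaration on the transition functions: Postgres skips rows with NULL input for a strict transition function, and with no initial state the first non-NULL input becomes the state. A rough Python model of that behavior follows. The helper run_strict_aggregate is a made-up name for this sketch, and None stands in for NULL.

```python
def first_agg(state, value):
    # Same body as the SQL function: SELECT $1
    return state


def last_agg(state, value):
    # Same body as the SQL function: SELECT $2
    return value


def run_strict_aggregate(sfunc, values):
    """Model of an aggregate with a STRICT transition function and no initial state."""
    state = None
    for value in values:
        if value is None:
            continue  # STRICT: rows with NULL input are skipped
        if state is None:
            state = value  # first non-NULL input becomes the state
        else:
            state = sfunc(state, value)
    return state


totals = [None, 5, 2, None]
first_total = run_strict_aggregate(first_agg, totals)
last_total = run_strict_aggregate(last_agg, totals)
print(first_total, last_total)
```

So first() returns the first non-NULL value it sees, not necessarily the value from the first row.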

Edit 1 - Use the postgres extension instead

Now I use this way: http://pgxn.org/dist/first_last_agg/

To install on ubuntu 14.04:

apt-get install postgresql-server-dev-9.3 git build-essential -y
git clone git://github.com/wulczer/first_last_agg.git
cd first_last_agg
make && sudo make install
psql -c 'create extension first_last_agg'

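It's a postgres extension that gives you first and last functions; apparently faster than the above way.

Edit 2 - Ordering and filtering

If you use aggregate functions (like these), you can order the results without needing to have the data already ordered: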

http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES

So the equivalent example, with ordering, would be something like:

SELECT first(id order by id), customer, first(total order by id)
  FROM purchases
 GROUP BY customer
 ORDER BY first(total);

Of course you can order and filter as you deem fit within the aggregate; it's very powerful syntax.

Answer 9

Use the ARRAY_AGG function for PostgreSQL, U-SQL, IBM DB2, and Google BigQuery SQL:

SELECT customer, (ARRAY_AGG(id ORDER BY total DESC))[1], MAX(total)
FROM purchases
GROUP BY customer

Answer 10

In SQL Server you can do this:

SELECT *
FROM (
SELECT ROW_NUMBER()
OVER(PARTITION BY customer
ORDER BY total DESC) AS StRank, *
FROM Purchases) n
WHERE StRank = 1

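Explanation: the rows are partitioned by customer and ordered by total within each partition. Each row in a partition is given a serial number, StRank, and we keep only the rows whose StRank is 1.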
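To make the numbering visible, here is a small sketch that runs the same ROW_NUMBER() idea through Python's built-in sqlite3 module rather than SQL Server. It assumes the bundled SQLite is 3.25 or newer (needed for window functions). The st_rank alias and the extra id tie-breaker in the window's ORDER BY are my additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, "Joe", 5), (2, "Sally", 3), (3, "Joe", 2), (4, "Sally", 1)],
)

# Number the rows inside each customer partition, highest total first.
ranked = conn.execute("""
SELECT id, customer, total,
       ROW_NUMBER() OVER (PARTITION BY customer ORDER BY total DESC, id) AS st_rank
FROM   purchases
ORDER  BY customer, st_rank
""").fetchall()
print(ranked)

# Keep only the rows numbered 1, which is what WHERE StRank = 1 does.
top = [row[:3] for row in ranked if row[3] == 1]
print(top)
```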

Answer 11

VERY fast solution

SELECT a.* 
FROM
    purchases a 
    JOIN ( 
        SELECT customer, min( id ) as id 
        FROM purchases 
        GROUP BY customer 
    ) b USING ( id );

and really very fast if the table is indexed by id:

create index purchases_id on purchases (id);
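One thing worth checking before adopting this query: it picks the row with the smallest id per customer, which happens to be the row with the largest total in the question's sample data. A minimal sqlite3 sketch with hypothetical data, where the two differ, shows what it returns in that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
# Hypothetical data: Joe's largest purchase is NOT his first one.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, "Joe", 2), (2, "Joe", 5)],
)

rows = conn.execute("""
SELECT a.*
FROM   purchases a
JOIN  (SELECT customer, MIN(id) AS id
       FROM   purchases
       GROUP  BY customer) b USING (id)
""").fetchall()
print(rows)
```

So this is "first row by id", which is a valid reading of the title but not the "largest total" asked for in the question body.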

Answer 12

Snowflake/Teradata support the QUALIFY clause, which works like HAVING for window functions:

SELECT id, customer, total
FROM PURCHASES p
QUALIFY ROW_NUMBER() OVER(PARTITION BY p.customer ORDER BY p.total DESC) = 1

Answer 13

In PostgreSQL, another possibility is to use the first_value window function in combination with SELECT DISTINCT:

select distinct customer_id,
                first_value(row(id, total)) over(partition by customer_id order by total desc, id)
from            purchases;

I created a composite (id, total), so both values are returned by the same function call. You can of course always apply first_value() twice.
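As a sketch of the "apply first_value() twice" variant, the following runs it with Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions). It uses the question's customer column instead of customer_id, and a named WINDOW so the definition is written only once. Both are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, "Joe", 5), (2, "Sally", 3), (3, "Joe", 2), (4, "Sally", 1)],
)

# Every row of a customer gets the same first_value() results,
# so DISTINCT collapses them to one row per customer.
rows = conn.execute("""
SELECT DISTINCT customer,
       first_value(id)    OVER w AS id,
       first_value(total) OVER w AS total
FROM   purchases
WINDOW w AS (PARTITION BY customer ORDER BY total DESC, id)
ORDER  BY customer
""").fetchall()
print(rows)
```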

Answer 14

This way works for me:

SELECT article, dealer, price
FROM   shop s1
WHERE  price=(SELECT MAX(s2.price)
              FROM shop s2
              WHERE s1.article = s2.article
              GROUP BY s2.article)
ORDER BY article;

This selects the highest price for each article.

Answer 15

This is how we can achieve this by using window functions:

    create table purchases (id int4, customer varchar(10), total integer);
    insert into purchases values (1, 'Joe', 5);
    insert into purchases values (2, 'Sally', 3);
    insert into purchases values (3, 'Joe', 2);
    insert into purchases values (4, 'Sally', 1);
    
    select ID, CUSTOMER, TOTAL from (
    select ID, CUSTOMER, TOTAL,
    row_number () over (partition by CUSTOMER order by TOTAL desc) RN
    from purchases) A where RN = 1;

Answer 16

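From my tests, the accepted "Supported by any database" solution by OMG Ponies has good speed.

Here I provide an any-database solution with the same approach, but more complete and clean. Ties are considered (assuming you want only one row per customer, even if several records share the max total for that customer), and other purchase fields (e.g. purchase_payment_id) are selected from the actual matching rows in the purchase table.

Supported by any database: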

select * from purchase
join (
    select min(id) as id from purchase
    join (
        select customer, max(total) as total from purchase
        group by customer
    ) t1 using (customer, total)
    group by customer
) t2 using (id)
order by customer

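This query is reasonably fast, especially when there is a composite index such as (customer, total) on the purchase table.

Remark:

t1, t2 are subquery aliases which can be removed depending on the database.
Caveat: the using (...) clause is currently not supported in MS-SQL and Oracle db as of this edit in January 2017. You have to expand it yourself, e.g. to on t2.id = purchase.id etc. The USING syntax works in SQLite, MySQL and PostgreSQL.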

Answer 17

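Use this if you want to select any row (by some specific condition of yours) from the set of aggregated rows,
or if you want to use another aggregate function (sum/avg) in addition to max/min, so that you cannot use the approach with DISTINCT ON.

You can use the following subquery: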

SELECT  
    (  
       SELECT id FROM t2
       WHERE id = ANY ( ARRAY_AGG( tf.id ) ) AND amount = MAX( tf.amount )   
    ) id,  
    name,   
    MAX(amount) ma,  
    SUM( ratio )  
FROM t2  tf  
GROUP BY name

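You can replace amount = MAX( tf.amount ) with any condition you want, with one restriction: this subquery must not return more than one row.

But if you want to do such things, you are probably looking for window functions.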

Answer 18

For SQL Server the most efficient way is:

with
ids as ( --condition for split table into groups
    select i from (values (9),(12),(17),(18),(19),(20),(22),(21),(23),(10)) as v(i) 
) 
,src as ( 
    select * from yourTable where  <condition> --use this as filter for other conditions
)
,joined as (
    select tops.* from ids 
    cross apply --it`s like for each rows
    (
        select top(1) * 
        from src
        where CommodityId = ids.i 
    ) as tops
)
select * from joined

and don't forget to create a clustered index for the columns used.

Answer 19

My approach via window functions (dbfiddle):

Assign row_number in each group: row_number() over (partition by agreement_id, order_id ) as nrow
Take only the first row in the group: filter (where nrow = 1)

with intermediate as (select 
 *,
 row_number() over ( partition by agreement_id, order_id ) as nrow,
 (sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma
from <your table>)

select 
  *,
  sum( order_suma ) filter (where nrow = 1) over (partition by agreement_id)
from intermediate

Answer 20

This can be achieved easily with the MAX function on total and GROUP BY id and customer.

SELECT id, customer, MAX(total) FROM  purchases GROUP BY id, customer
ORDER BY total DESC;

20 answers

Answer 1  1518  2011-10-03 02:21:52
Answer 2  1425  accepted  2010-09-27 01:27:54
Answer 3  227  2016-01-11 06:05:43
Answer 4  66  2013-06-27 08:38:44
Answer 5  41  2014-08-27 18:14:26
Answer 6  17  2013-06-17 18:02:04
Answer 7  17  2018-03-24 16:11:27
Answer 8  11  2015-03-10 15:19:50
Answer 9  11  2019-04-04 20:54:36
Answer 10  10  2018-12-29 16:12:47
Answer 11  8  2014-04-08 16:13:33
Answer 12  5  2019-11-17 21:19:50
Answer 13  5
Answer 14  5  2020-07-17 03:40:03
Answer 15  5  2022-02-07 04:30:32
Answer 16  4  2017-01-04 15:47:37
Answer 17  2  2018-09-28 13:50:40
Answer 18  2  2019-01-18 10:59:03
Answer 19  0  2021-05-13 13:18:49
Answer 20  -1  2021-12-16 09:43:22
