How can I use PostgreSQL's DISTINCT ON clause to also return a count of the duplicates?
Suppose I have a table like this:
+--------+--------+------+--------+---------+
| A      | B      | C    | g      | h       |
+--------+--------+------+--------+---------+
| cat    | dog    | bird | 34.223 | 54.223  |
| cat    | pigeon | goat | 23.23  | 54.948  |
| cat    | dog    | bird | 17.386 | 26.398  |
| gopher | pigeon | bird | 23.552 | 89.223  |
+--------+--------+------+--------+---------+
but with many more fields to the right (i, j, k, ...).
I need a result table that looks like this:
+-----+--------+------+-----+-----+-----+-----+-------+
| A   | B      | C    | g   | h   | ... | z   | count |
+-----+--------+------+-----+-----+-----+-----+-------+
| cat | dog    | bird | xxx | xxx |     | xxx | 23    |
| cat | pigeon | goat | xxx | xxx |     | xxx | 78    |
+-----+--------+------+-----+-----+-----+-----+-------+
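I would normally use GROUP BY, but I don't want to repeat all the column names (g, h, i, ... z).
I can currently get the desired result by combining DISTINCT ON with a CTE and a correlated subquery, but the query runs very slowly (over 500k records, with a lot of duplicates):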
WITH temp AS (
    SELECT a, b, c, COUNT(*) AS count
    FROM my_table
    GROUP BY a, b, c
)
SELECT DISTINCT ON (a, b, c) *, (
    SELECT count
    FROM temp
    WHERE temp.a = t.a
      AND temp.b = t.b
      AND temp.c = t.c
) AS count
FROM my_table AS t
ORDER BY a, b, c, x, y;
Is there a more efficient way to get the number of rows that DISTINCT ON collapses into each result row? Something like:
SELECT DISTINCT ON (a, b, c)
*, COUNT(*)
FROM my_table
ORDER BY a, b, c, count;
Or am I taking the wrong approach?
Use COUNT() with PARTITION BY:
SELECT DISTINCT ON (a, b, c) *, COUNT(*) OVER (PARTITION BY a, b, c)
FROM my_table
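If you care at all about the remaining fields, you should probably also add an ORDER BY to the query; otherwise, the row used to supply the data shown in those fields may be inconsistent from one run to the next.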
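To make that concrete, here is a minimal sketch against the four sample rows from the question. The column types and the choice of g as the tie-breaker are assumptions made for illustration; substitute whichever columns should decide which row of each group is kept.

```sql
-- Hypothetical sample data mirroring the table in the question
-- (column types are assumed).
CREATE TEMP TABLE my_table (a text, b text, c text, g numeric, h numeric);

INSERT INTO my_table (a, b, c, g, h) VALUES
    ('cat',    'dog',    'bird', 34.223, 54.223),
    ('cat',    'pigeon', 'goat', 23.23,  54.948),
    ('cat',    'dog',    'bird', 17.386, 26.398),
    ('gopher', 'pigeon', 'bird', 23.552, 89.223);

-- One row per (a, b, c). Window functions are evaluated before
-- DISTINCT ON discards rows, so the count covers the whole group.
SELECT DISTINCT ON (a, b, c)
       *,
       COUNT(*) OVER (PARTITION BY a, b, c) AS count
FROM my_table
-- The leading ORDER BY expressions must match the DISTINCT ON list;
-- g is an assumed tie-breaker that makes the kept row deterministic.
ORDER BY a, b, c, g;
```

With this data, the (cat, dog, bird) group should come back once with count = 2, keeping the row with the smaller g (17.386), while the other two groups come back with count = 1.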