Suppose I have a table like this:
+--------+--------+------+--------+---------+
| A | B | C | g | h |
+--------+--------+------+--------+---------+
| cat | dog | bird | 34.223 | 54.223 |
| cat | pigeon | goat | 23.23 | 54.948 |
| cat | dog | bird | 17.386 | 26.398 |
| gopher | pigeon | bird | 23.552 | 89.223 |
+--------+--------+------+--------+---------+
but with many more fields to the right (i, j, k, ...).
I need a resulting table that looks like:
+-----+--------+------+-----+-----+-----+-----+-------+
| A | B | C | g | h | ... | z | count |
+-----+--------+------+-----+-----+-----+-----+-------+
| cat | dog | bird | xxx | xxx | | xxx | 23 |
| cat | pigeon | goat | xxx | xxx | | xxx | 78 |
+-----+--------+------+-----+-----+-----+-----+-------+
I would normally use a GROUP BY, but I don't want to have to repeat all of the column names (g, h, i, ... z).
I can currently get the result I want using a CTE and a correlated subquery combined with DISTINCT ON, but the query is very slow to run (500k+ records) and has a lot of duplication:
WITH temp AS (
SELECT a, b, c, COUNT(*) AS count
FROM my_table
GROUP BY a, b, c
)
SELECT DISTINCT ON (a, b, c) *, (
SELECT count
FROM temp
WHERE
temp.a = t.a
AND temp.b = t.b
AND temp.c = t.c
) as count
FROM my_table as t
ORDER BY a, b, c, x, y;
Is there a more efficient way to get the count of the rows that were eliminated by DISTINCT ON? Something like:
SELECT DISTINCT ON (a, b, c)
*, COUNT(*)
FROM my_table
ORDER BY a, b, c, count;
Or am I taking the wrong approach to begin with?
Use COUNT() as a window function with PARTITION BY:
SELECT DISTINCT ON (a, b, c) *, COUNT(*) OVER (PARTITION BY a, b, c)
FROM my_table
You should probably also add an ORDER BY to your query if you care about the values in the remaining fields; otherwise the row that DISTINCT ON keeps for each group, and therefore the data displayed in those fields, may be inconsistent from one run to the next.
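Combining the window count with an explicit ordering, a minimal sketch might look like the following. It assumes PostgreSQL, since DISTINCT ON is PostgreSQL-specific. The tie-breaker columns g and h and the alias count are only illustrative choices; substitute whichever columns should decide which row of each group is kept.

```sql
-- Sketch only: g and h are assumed tie-breakers, not taken from the question.
SELECT DISTINCT ON (a, b, c)
       t.*,
       -- Window functions are evaluated before DISTINCT ON is applied,
       -- so this counts every row in the (a, b, c) group, not just the kept one.
       COUNT(*) OVER (PARTITION BY a, b, c) AS count
FROM my_table AS t
-- The leading ORDER BY expressions must match the DISTINCT ON list;
-- the remaining ones determine which row survives for each group.
ORDER BY a, b, c, g, h;
```

With this form the grouping columns are listed once per clause and the remaining columns (g through z) never have to be spelled out, which was the main objection to a plain GROUP BY.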