I am trying to run a SQL query to get four random items. As the table product_filter has more than one tuple per product, I have to use DISTINCT in SELECT, so I get this error:
for SELECT DISTINCT, ORDER BY expressions must appear in select list
But if I put RANDOM() in my SELECT, it defeats the DISTINCT result.
Does anyone know how to use DISTINCT with the RANDOM() function? Below is my problematic query.
SELECT DISTINCT
p.id,
p.title
FROM
product_filter pf
JOIN product p ON pf.cod_product = p.cod
JOIN filters f ON pf.cod_filter = f.cod
WHERE
p.visible = TRUE
LIMIT 4
ORDER BY RANDOM();
You can either use a subquery (give it an alias, here t; PostgreSQL before version 16 requires one):
SELECT * FROM (
SELECT DISTINCT p.cod, p.title ... JOIN... WHERE
) AS t ORDER BY RANDOM() LIMIT 4;
or you can try GROUP BY on those same fields:
SELECT p.cod, p.title, MIN(RANDOM()) AS o FROM ... JOIN ...
WHERE ... GROUP BY p.cod, p.title ORDER BY o LIMIT 4;
Which of the two expressions evaluates faster depends on table structure and indexing. With proper indexing on cod and title, the subquery version will probably run faster: cod and title can be read from the index, and cod is the only key needed for the JOIN. So if you index by title, cod and visible (the last is used in the WHERE), it is likely that the physical table will not be accessed at all.
I am not so sure whether this would happen with the second expression too.
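One way to settle the question for a given database is to compare the two plans directly. The following is only a sketch for PostgreSQL: it assumes the tables from the question exist, the index name is made up, and the index column list simply follows the suggestion above. The elided parts of the two queries are filled in with the joins from the question as an assumption.

```sql
-- Assumed index following the column list suggested above; the name is hypothetical.
CREATE INDEX product_title_cod_visible_idx ON product (title, cod, visible);

-- EXPLAIN (ANALYZE) executes the query and reports the actual plan and timings.
-- Variant 1: DISTINCT in a subquery, random ordering outside.
EXPLAIN (ANALYZE)
SELECT *
FROM  (
   SELECT DISTINCT p.cod, p.title
   FROM   product_filter pf
   JOIN   product p ON pf.cod_product = p.cod
   JOIN   filters f ON pf.cod_filter = f.cod
   WHERE  p.visible
   ) AS t
ORDER  BY random()
LIMIT  4;

-- Variant 2: GROUP BY with one random value per group.
EXPLAIN (ANALYZE)
SELECT p.cod, p.title, min(random()) AS o
FROM   product_filter pf
JOIN   product p ON pf.cod_product = p.cod
JOIN   filters f ON pf.cod_filter = f.cod
WHERE  p.visible
GROUP  BY p.cod, p.title
ORDER  BY o
LIMIT  4;
```

Whether the planner actually uses the index depends on table sizes and statistics, so the plans should be read on real data rather than assumed.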
You can simplify your query to avoid the problem a priori:
SELECT p.cod, p.title
FROM product p
WHERE p.visible
AND EXISTS (
SELECT 1
FROM product_filter pf
JOIN filters f ON f.cod = pf.cod_filter
WHERE pf.cod_product = p.cod
)
ORDER BY random()
LIMIT 4;
You have only columns from table product in the result; other tables are only checked for the existence of a matching row. For a case like this, the EXISTS semi-join is likely the fastest and simplest solution. Using it does not multiply rows from the base table product, so you don't need to remove them again with DISTINCT.
LIMIT has to come last, after ORDER BY.
I simplified WHERE p.visible = 't' to p.visible, because this should be a boolean column.
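To see why the join needs DISTINCT while EXISTS does not, here is a minimal sketch for PostgreSQL. The temporary tables and sample rows are invented for illustration; only the table and column names follow the question. Temporary tables shadow permanent tables of the same name for the session, so this is best run in a scratch database.

```sql
-- Hypothetical sample data: names follow the question, values are made up.
CREATE TEMP TABLE product (cod int PRIMARY KEY, title text, visible boolean);
CREATE TEMP TABLE filters (cod int PRIMARY KEY);
CREATE TEMP TABLE product_filter (cod_product int, cod_filter int);

INSERT INTO product VALUES (1, 'chair', true), (2, 'table', true), (3, 'lamp', false);
INSERT INTO filters VALUES (10), (20);
INSERT INTO product_filter VALUES (1, 10), (1, 20), (2, 10);

-- Plain join: product 1 is returned twice, once per matching filter.
SELECT p.cod, p.title
FROM   product_filter pf
JOIN   product p ON pf.cod_product = p.cod
JOIN   filters f ON pf.cod_filter = f.cod
WHERE  p.visible;

-- EXISTS semi-join: each visible product with at least one filter is
-- returned once, so ORDER BY random() works without DISTINCT.
SELECT p.cod, p.title
FROM   product p
WHERE  p.visible
AND    EXISTS (
   SELECT 1
   FROM   product_filter pf
   JOIN   filters f ON f.cod = pf.cod_filter
   WHERE  pf.cod_product = p.cod
   )
ORDER  BY random()
LIMIT  4;
```

With this sample data the first query returns three rows and the second returns two, in random order.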
Use a subquery. Don't forget the table alias, t. LIMIT comes after ORDER BY.
SELECT *
FROM (SELECT DISTINCT a, b, c
FROM datatable WHERE a = 'hello'
) t
ORDER BY random()
LIMIT 10;
I think you need a subquery:
select *
from (select DISTINCT p.cod, p.title
from product_filter pf join
product p
on pf.cod_product = p.cod
where p.visible = 't'
) t
order by RANDOM()
LIMIT 4
Calculate the distinct values first, and then do the limit.
Do note, this does have performance implications, because this query does a distinct on everything before selecting what you want. Whether this matters depends on the size of your table and how you are using the query.
SELECT DISTINCT U.* FROM
(
SELECT p.cod, p.title FROM product_filter pf
JOIN product p on pf.cod_product = p.cod
JOIN filters f on pf.cod_filter = f.cod
WHERE p.visible = 't'
ORDER BY RANDOM()
) AS U
LIMIT 4
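This does the RANDOM first, then the LIMIT afterwards. Note that SQL does not guarantee that the outer DISTINCT preserves the order produced inside the subquery, so the four rows returned may not be a random sample.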