Right now this query is taking a very long time to run.
The query is:
select count(id), variety_id, name
from tblItem
where order_id IN (
select order_id
from tblItem
where variety_id=4005
order by order_id DESC)
AND variety_id != 4005
GROUP BY variety_id
order by count(id) DESC
LIMIT 5;
I have indexes on variety_id and order_id. I'm basically trying to build a recommendation engine. The query looks for the top 5 items people buy when they also bought variety_id 4005. But as I said, it takes way too long to run.
Does anyone have a way to optimize this query?
Try this:
select count(t1.id), t1.variety_id, t1.name
from tblItem t1
inner join tblItem t2 ON t2.order_id = t1.order_id and t2.variety_id = 4005
where t1.variety_id != 4005
GROUP BY t1.variety_id, t1.name
ORDER BY count(t1.id) DESC
LIMIT 5;
I've often found that MySQL optimizes WHERE ... IN (SELECT ...) poorly, and that JOIN works better; I've read that recent MySQL versions are better, so it may be version-dependent. Also, you should use COUNT(*) unless the column can be NULL and you need to ignore the NULL values in the count.
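To illustrate the COUNT(*) versus COUNT(column) point, here is a minimal sketch. The count_demo table and its contents are made up for the illustration and are not part of the schema in the question.

```sql
-- Hypothetical scratch table, used only to show how NULLs affect COUNT.
CREATE TEMPORARY TABLE count_demo (id INT NULL);
INSERT INTO count_demo (id) VALUES (1), (2), (NULL);

SELECT COUNT(*)  AS all_rows,      -- counts every row, including the NULL one
       COUNT(id) AS non_null_ids   -- skips rows where id IS NULL
FROM count_demo;
-- all_rows = 3, non_null_ids = 2
```

If id is a NOT NULL primary key, as it probably is in tblItem, both forms return the same number.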
SELECT COUNT(*) AS count, t1.variety_id, t1.name
FROM tblItem AS t1
JOIN (SELECT DISTINCT order_id
FROM tblItem
WHERE variety_id = 4005) AS t2
ON t1.order_id = t2.order_id
WHERE t1.variety_id != 4005
-- name is listed in GROUP BY so the query also runs under ONLY_FULL_GROUP_BY
GROUP BY t1.variety_id, t1.name
ORDER BY count DESC
LIMIT 5;
The subquery with DISTINCT is needed because an order can contain several rows with variety_id = 4005; without it, the join would multiply each count by the number of those matching rows.
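A small sketch of that multiplication effect, assuming a made-up demo_item table with the same columns as tblItem (the table name and the rows are hypothetical). It is a regular table rather than a temporary one, because MySQL does not allow a TEMPORARY table to be referenced twice in the same query.

```sql
-- Hypothetical demo table: order 100 contains variety 4005 twice
-- and one other item (variety 7).
CREATE TABLE demo_item (
  id INT PRIMARY KEY,
  order_id INT,
  variety_id INT,
  name VARCHAR(20)
);
INSERT INTO demo_item (id, order_id, variety_id, name) VALUES
  (1, 100, 4005, 'target'),
  (2, 100, 4005, 'target'),
  (3, 100, 7,    'other');

-- Plain self-join: row 3 matches both 4005 rows, so it is counted twice.
SELECT t1.variety_id, COUNT(*) AS cnt
FROM demo_item AS t1
JOIN demo_item AS t2
  ON t2.order_id = t1.order_id AND t2.variety_id = 4005
WHERE t1.variety_id != 4005
GROUP BY t1.variety_id;
-- variety_id = 7, cnt = 2

-- Join against DISTINCT order_ids: order 100 appears once, so row 3 is counted once.
SELECT t1.variety_id, COUNT(*) AS cnt
FROM demo_item AS t1
JOIN (SELECT DISTINCT order_id
      FROM demo_item
      WHERE variety_id = 4005) AS t2
  ON t2.order_id = t1.order_id
WHERE t1.variety_id != 4005
GROUP BY t1.variety_id;
-- variety_id = 7, cnt = 1

DROP TABLE demo_item;
```

If each order can hold a given variety only once, the two queries return the same counts and the DISTINCT is just a safeguard.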