I have a SELECT DISTINCT statement with multiple LEFT JOINs that performs poorly when the WHERE clause is large. Here is the statement:
SELECT DISTINCT u.*, ri.id as reg_id, d.id as dist_id
FROM users u
LEFT JOIN earned_points ep ON u.id = ep.user_id
LEFT JOIN distributors d ON d.id = ep.distributor_id
OR d.id = u.distributor_id
OR d.id = u.additional_distributor_id
LEFT JOIN registration_items_users riu ON u.id = riu.user_id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21
LEFT JOIN registration_items ri ON riu.registration_item_id = ri.id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931 );
The Explain for this query is below:
This query takes around 4 seconds to complete. If I reduce the WHERE clause to a single id, it speeds up to about 170 ms.
I would appreciate any suggestions on how to make this query quicker.
Thank you
I was able to come up with a solution based on Rick James's suggestion (the accepted answer). Using UNION and getting rid of the LEFT JOINs and the DISTINCT did the trick. The new query takes around 200 ms, compared to about 4 seconds for the version above.
(SELECT u.*,
(SELECT riu.registration_item_id
FROM registration_items_users riu
WHERE riu.user_id = u.id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21) as reg_id,
d.id as dist_id
FROM users u
JOIN earned_points ep ON u.id = ep.user_id
JOIN distributors d ON d.id = ep.distributor_id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931))
UNION
(SELECT u.*,
(SELECT riu.registration_item_id
FROM registration_items_users riu
WHERE riu.user_id = u.id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21) as reg_id,
d.id as dist_id
FROM users u
JOIN distributors d ON d.id = u.distributor_id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931))
UNION
(SELECT u.*,
(SELECT riu.registration_item_id
FROM registration_items_users riu
WHERE riu.user_id = u.id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21) as reg_id,
d.id as dist_id
FROM users u
JOIN distributors d ON d.id = u.additional_distributor_id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931))
In the EXPLAIN, look at the u line. It is doing a "table scan" of about 6974 rows.
Get rid of LEFT unless the "right" table is optional.
Turn the OR into a UNION; that is where the indexes are failing you. (UNION ALL is faster than UNION DISTINCT; pick whichever one makes sense.)
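As a small illustration of the difference between the two UNION flavors, here is a sketch using literal values rather than the real tables (the values 1 and 201 are made up for the example). Both branches produce the same row, which is what happens when a user matches a distributor through more than one path:

```sql
-- Both branches return the identical row (1, 201).

-- UNION ALL keeps every row from every branch: two rows come back.
SELECT 1 AS user_id, 201 AS dist_id
UNION ALL
SELECT 1, 201;

-- UNION DISTINCT removes duplicates across the branches: one row comes back.
-- The de-duplication pass is the extra work that makes it slower.
SELECT 1 AS user_id, 201 AS dist_id
UNION DISTINCT
SELECT 1, 201;
```

If the same user can never reach the same distributor through two different branches, UNION ALL would be safe; otherwise UNION DISTINCT is needed to preserve what the original SELECT DISTINCT did.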
Assuming the LEFTs can be removed, and assuming the DISTINCT can be moved from SELECT to UNION:
SELECT u.*, ri.id as reg_id, d.id as dist_id
FROM users u
JOIN earned_points ep ON u.id = ep.user_id -- ep needed only for this
JOIN distributors d ON d.id = ep.distributor_id -- This one line differs
JOIN registration_items_users riu ON u.id = riu.user_id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21
JOIN registration_items ri ON riu.registration_item_id = ri.id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931
)
UNION DISTINCT
SELECT u.*, ri.id as reg_id, d.id as dist_id
FROM users u
JOIN distributors d ON d.id = u.distributor_id
JOIN registration_items_users riu ON u.id = riu.user_id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21
JOIN registration_items ri ON riu.registration_item_id = ri.id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931
)
UNION DISTINCT
SELECT u.*, ri.id as reg_id, d.id as dist_id
FROM users u
JOIN distributors d ON d.id = u.additional_distributor_id
JOIN registration_items_users riu ON u.id = riu.user_id
AND riu.distributor_id = d.id
AND riu.registration_item_id = 21
JOIN registration_items ri ON riu.registration_item_id = ri.id
WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931
) ;
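Once the OR is split up, each branch can be checked on its own with EXPLAIN to see whether users is still being scanned. The sketch below is only a suggestion: the question does not show which indexes already exist, so the ALTER TABLE statements are assumptions, the index names are invented for this example, and the IN list is shortened.

```sql
-- Check one branch at a time. Ideally the u row no longer shows type = ALL
-- (a full table scan) once a suitable index is available.
EXPLAIN
SELECT u.*, d.id AS dist_id
FROM users u
JOIN distributors d ON d.id = u.additional_distributor_id
WHERE d.id IN (201, 281, 321);

-- Hypothetical indexes, one per join path. Skip any that already exist;
-- SHOW CREATE TABLE users; lists the current ones.
ALTER TABLE users
    ADD INDEX idx_users_distributor (distributor_id),
    ADD INDEX idx_users_additional_distributor (additional_distributor_id);

ALTER TABLE earned_points
    ADD INDEX idx_ep_distributor_user (distributor_id, user_id);

ALTER TABLE registration_items_users
    ADD INDEX idx_riu_lookup (user_id, distributor_id, registration_item_id);
```

Whether the optimizer actually picks these indexes depends on the data distribution, so compare the EXPLAIN output before and after.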
It is generally a bad idea to splay an array across columns. That seems to be what is going on with distributors, and this mess may be a result of that.
Edit
Even better would be to pull the ri and riu stuff out of the selects and turn it into a subquery. Here's the gist; I don't have the energy to write it all:
SELECT x.*,
( SELECT ... ri and riu stuff ... ) AS reg_id
FROM (
-- from above, less the ri and riu stuff:
SELECT ...
UNION DISTINCT
SELECT ...
UNION DISTINCT
SELECT ...
) AS x;
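One way the gist might be spelled out, using only the tables and columns from the question, is sketched here. It rests on a few assumptions that are not confirmed by the original post: users has no column already named dist_id, registration_items_users holds at most one row per (user_id, distributor_id, registration_item_id), and every riu.registration_item_id refers to an existing registration_items row, so the join to ri can be dropped.

```sql
SELECT x.*,
       -- Correlated subquery: runs once per row that survives the UNION.
       -- Returns NULL when the user has no registration row for item 21.
       -- Assumes at most one matching row; add LIMIT 1 if that may not hold.
       ( SELECT riu.registration_item_id
           FROM registration_items_users riu
          WHERE riu.user_id = x.id
            AND riu.distributor_id = x.dist_id
            AND riu.registration_item_id = 21
       ) AS reg_id
FROM (
    -- Branch 1: distributor reached through earned_points
    SELECT u.*, d.id AS dist_id
      FROM users u
      JOIN earned_points ep ON u.id = ep.user_id
      JOIN distributors d ON d.id = ep.distributor_id
     WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
                    1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
                    1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931)
    UNION DISTINCT
    -- Branch 2: the user's primary distributor
    SELECT u.*, d.id AS dist_id
      FROM users u
      JOIN distributors d ON d.id = u.distributor_id
     WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
                    1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
                    1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931)
    UNION DISTINCT
    -- Branch 3: the user's additional distributor
    SELECT u.*, d.id AS dist_id
      FROM users u
      JOIN distributors d ON d.id = u.additional_distributor_id
     WHERE d.id IN (201,281,321,631,901,971,1211,1601,1611,1621,
                    1631,1641,1651,1661,1671,1681,1691,1701,1711,1721,1731,
                    1741,1751,1761,1771,1781,2281,2291,2401,2781,2801,2931)
) AS x;
```

Note that this version keeps users who have no matching registration row (reg_id comes back NULL), which matches the LEFT JOIN behavior of the original question rather than the inner-join version earlier in this answer.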