
MySQL sorting on joined table column extremely slow (temp table)

I have some tables:

object 
person 
project 
[...] (some more tables) 
type 

The object table has foreign keys to all other tables.

Now I do a query like:

SELECT * FROM object 
LEFT JOIN person ON (object.person_id = person.id) 
LEFT JOIN project ON (object.project_id = project.id)
LEFT JOIN [...] (all other joins)
LEFT JOIN type ON (object.type_id = type.id)
WHERE object.customer_id = XXX 
ORDER BY object.type_id ASC
LIMIT 25

This works well and is fast, even for big result sets. For example, I have 90000 objects and the query takes about 3 seconds. The result is quite big because the tables have a lot of columns and all of them are fetched. For info: I'm using Symfony with Propel, InnoDB and the "doSelectJoinAll" function.

But if I run a query like this (sorting by type.name):

SELECT * FROM object 
LEFT JOIN person ON (object.person_id = person.id) 
LEFT JOIN project ON (object.project_id = project.id)
LEFT JOIN [...] (all other joins)
LEFT JOIN type ON (object.type_id = type.id)
WHERE object.customer_id = XXX 
ORDER BY type.name ASC
LIMIT 25

The query takes about 200 seconds!

EXPLAIN:

id | select_type | table  | type   | possible_keys | key         | key_len | ref                 | rows   | Extra
1  | SIMPLE      | object | ref    | object_FI_2   | object_FI_2 | 4       | const               | 164966 | Using where; Using temporary; Using filesort
1  | SIMPLE      | person | eq_ref | PRIMARY       | PRIMARY     | 4       | db.object.person_id | 1      |
1  | SIMPLE      | ...    | eq_ref | PRIMARY       | PRIMARY     | 4       | db.object...._id    | 1      |
1  | SIMPLE      | type   | eq_ref | PRIMARY       | PRIMARY     | 4       | db.object.type_id   | 1      |

I saw in the processlist that MySQL creates a temporary table when sorting on a column of a joined table.

Adding an index to type.name didn't improve the performance. There are only about 800 type rows.

I found out that the many joins and the big result are the problem, because if I run a query with just one join, like:

SELECT * FROM object 
LEFT JOIN type ON (object.type_id = type.id)
WHERE object.customer_id = XXX 
ORDER BY type.name ASC
LIMIT 25

it works as fast as expected.

Is there a way to improve such sorting queries on a big resultset with many joined tables? Or is it just a bad habit to sort on a joined table column and this shouldn't be done anyway?

Thank you

LEFT gets in the way of rearranging the order of the tables. How fast is it without any LEFT? Do you get the same answer?

LEFT may be a red herring... Here's what the optimizer is likely to be doing:

  1. Decide what order to do the tables in. Take into consideration any WHERE filtering and any LEFTs. Because of WHERE object.customer_id = XXX, object is likely to be the best table to start with.
  2. Get the rows from object that satisfy the WHERE.
  3. Get the columns needed from the other tables (do the JOINs).
  4. Sort according to the ORDER BY (** see below).
  5. Deliver the first 25 rows.

** Let's dig deeper into step 4 by comparing these two cases:

WHERE object.customer_id = XXX ORDER BY object.id
WHERE object.customer_id = XXX ORDER BY virtually-anything-else

You have INDEX(customer_id), correct? And the table is InnoDB, correct? Well, each secondary index implicitly includes the PRIMARY KEY, as if you had said INDEX(customer_id, id). The optimal index for the first WHERE + ORDER BY is precisely that. It will locate XXX and scan 25 rows, then stop. You might say that steps 2, 4 and 5 are blended together.
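The same idea can be applied to the question's first query, which sorts on object.type_id. The following is only a sketch: the index name idx_customer_type and the customer id 123 are made up for illustration, and the remaining joins from the question are left out. A composite index that starts with the WHERE column and continues with the ORDER BY column lets MySQL read the rows already in sorted order and stop after 25. It does not help with ORDER BY type.name, because that column lives in a different table.

```sql
-- Hypothetical index name; the columns are the ones used in the question.
ALTER TABLE object ADD INDEX idx_customer_type (customer_id, type_id);

-- 123 is a stand-in for the XXX placeholder in the question.
-- If the index is used for ordering, the Extra column for `object`
-- should no longer show "Using temporary; Using filesort".
EXPLAIN
SELECT * FROM object
LEFT JOIN type ON (object.type_id = type.id)
WHERE object.customer_id = 123
ORDER BY object.type_id ASC
LIMIT 25;
```

Whether the optimizer actually picks this index depends on the data distribution, so the EXPLAIN output is worth checking rather than assuming.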

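The second WHERE + ORDER BY has to gather all the matching rows and carry them through step 4 before any of them can be discarded. This could be thousands of rows (EXPLAIN estimates 164966 here), each as wide as all the joined columns put together. Hence it is likely to be a lot slower.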
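One common way to shrink what has to be sorted is a "deferred join": first find the 25 object ids using only the join needed for the ORDER BY (the question reports that this single-join form is fast), then join everything else for just those 25 rows. This is a sketch, not something tested against the original schema. It assumes object.id is the primary key; the alias top25 and the customer id 123 are made up, and the joins elided as [...] in the question would go where the comment indicates.

```sql
SELECT *
FROM (
    -- Step 1: sort narrow rows only (one id per object) and keep 25.
    SELECT object.id
    FROM object
    LEFT JOIN type ON (object.type_id = type.id)
    WHERE object.customer_id = 123        -- stand-in for XXX
    ORDER BY type.name ASC
    LIMIT 25
) AS top25                                -- hypothetical alias
JOIN object ON (object.id = top25.id)
-- Step 2: the wide joins now run for at most 25 rows.
LEFT JOIN person ON (object.person_id = person.id)
LEFT JOIN project ON (object.project_id = project.id)
-- (all other joins from the original query go here)
LEFT JOIN type ON (object.type_id = type.id)
-- The order of a derived table is not guaranteed to survive the outer
-- joins, so the final 25 rows are sorted again, which is cheap.
ORDER BY type.name ASC;
```

The inner query still sorts every object of the customer, but each row in the temporary table is a single id plus the sort key instead of every column of every joined table. Note that SELECT * here also returns the extra top25.id column, and that an ORM such as Propel may need a custom query to produce this shape.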

See also the article on building optimal indexes.

