MySQL performance, inner join, how to avoid Using temporary and filesort

I have two tables, table 1 and table 2.

Table 1: PARTNUM - ID_BRAND. partnum is the primary key and id_brand is indexed.

Table 2: ID_BRAND - BRAND_NAME. id_brand is the primary key and brand_name is indexed.

Table 1 contains 1 million records and table 2 contains 1,000 records.
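
For anyone who wants to reproduce the plan, here is a minimal sketch of the two tables as described. The column types and the index names (`idx_brand_name`, `idx_id_brand`) are assumptions for illustration; only the column names and which columns are keyed come from the description, and the table names are the ones used in the query.

```sql
-- Sketch only: column types and index names are assumed, not taken from the real schema.
CREATE TABLE products_brands (
    id_brand   INT          NOT NULL,
    brand_name VARCHAR(100) NOT NULL,
    PRIMARY KEY (id_brand),
    INDEX idx_brand_name (brand_name)
);

CREATE TABLE products_main (
    partnum  VARCHAR(50) NOT NULL,
    id_brand INT         NULL,
    PRIMARY KEY (partnum),
    INDEX idx_id_brand (id_brand)
);
```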

I'm trying to optimize a query using EXPLAIN, and after many attempts I have reached a dead end.

EXPLAIN 
SELECT pm.partnum, pb.brand_name
FROM products_main AS pm 
LEFT JOIN products_brands AS pb ON pm.id_brand=pb.id_brand
ORDER BY pb.brand ASC 
LIMIT 0, 10

The query returns this execution plan:

ID, SELECT_TYPE, TABLE, TYPE, POSSIBLE_KEYS, KEY, KEY_LEN , REF, ROWS, EXTRA
1, SIMPLE, pm, range, PRIMARY, PRIMARY, 1, , 1000000, Using where; Using temporary; Using filesort
1, SIMPLE, pb, ref, PRIMARY, PRIMARY, 4, demo.pm.id_pbrand, 1,

The MySQL query optimizer shows a temporary table plus a filesort in the execution plan. How can I avoid this?

The "EVIL" is in the ORDER BY pb.brand ASC. Ordering by that external field seems to be the bottleneck.

Try replacing the join with a subquery. MySQL's optimizer kind of sucks; subqueries often give better performance than joins.
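
A rough sketch of what that could look like, using a correlated scalar subquery in place of the join. I am assuming `pb.brand` in the question means the `brand_name` column. Whether this actually removes the temporary table or the filesort depends on the MySQL version and the indexes, so check it with EXPLAIN rather than taking it on faith.

```sql
EXPLAIN
SELECT pm.partnum,
       -- id_brand is the primary key of products_brands,
       -- so this subquery returns at most one row per part
       (SELECT pb.brand_name
        FROM products_brands AS pb
        WHERE pb.id_brand = pm.id_brand) AS brand_name
FROM products_main AS pm
ORDER BY brand_name ASC
LIMIT 0, 10;
```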

This question is somewhat outdated, but I did find it, and so will other people.

MySQL uses a temporary table if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue.

So you just need to reverse the join order with STRAIGHT_JOIN, to bypass the order chosen by the optimizer:

SELECT STRAIGHT_JOIN pm.partnum, pb.brand_name
FROM products_brands AS pb 
RIGHT JOIN products_main AS pm ON pm.id_brand=pb.id_brand
ORDER BY pb.brand ASC 
LIMIT 0, 10
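
As a sketch of the same idea with an inner join (which is what the question title asks about): `STRAIGHT_JOIN` makes MySQL read the tables in the order they are listed in the FROM clause, so `products_brands` is read first here. This assumes every part has a matching brand, so dropping the outer join loses no rows, and that `pb.brand` in the question means the `brand_name` column.

```sql
EXPLAIN
SELECT STRAIGHT_JOIN pm.partnum, pb.brand_name
FROM products_brands AS pb              -- listed first, so it is read first
INNER JOIN products_main AS pm ON pm.id_brand = pb.id_brand
ORDER BY pb.brand_name ASC              -- a column of the first table in the join order
LIMIT 0, 10;
```

Compare the Extra column of this plan with the original one to see whether Using temporary and Using filesort are gone.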

Also make sure that the max_heap_table_size and tmp_table_size variables are set to a number big enough to store the results:

SET global tmp_table_size=100000000;
SET global max_heap_table_size=100000000;

That is 100 megabytes in this example. These can be set in the my.cnf config file, too.
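
A small sketch for checking the current values and for changing them only for the current connection. Both variables have session scope as well as global scope; the 100000000 figure is just the example value used above. Note that SET GLOBAL only affects connections opened afterwards.

```sql
-- Inspect the current limits (values are in bytes)
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Raise them for this connection only, leaving the global defaults untouched
SET SESSION tmp_table_size = 100000000;
SET SESSION max_heap_table_size = 100000000;
```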

First, try changing your index on the products_brands table. Delete the existing one on brand_name, and create a new one:

ALTER TABLE products_brands ADD INDEX newIdx (brand_name, id_brand)

Then, the table will already have an "orderedByBrandName" index with the ids you need for the join, and you can try:

EXPLAIN
SELECT pb.brand_name, pm.partnum
FROM products_brands AS pb 
  LEFT JOIN products_main AS pm ON pb.id_brand = pm.id_brand
LIMIT 0, 10

Note that I also changed the order of the tables in the query, so you start with the small one.
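
One caveat on the query above: without an ORDER BY, SQL does not guarantee any row order, even if the plan happens to read the index in order. Here is a sketch that keeps the sort explicit so the optimizer can satisfy it from `newIdx`, assuming `pb.brand` in the question means the `brand_name` column:

```sql
EXPLAIN
SELECT pb.brand_name, pm.partnum
FROM products_brands AS pb
  LEFT JOIN products_main AS pm ON pb.id_brand = pm.id_brand
ORDER BY pb.brand_name ASC   -- explicit sort; can be resolved by reading newIdx in order
LIMIT 0, 10;
```

With products_brands on the left side of the LEFT JOIN, brands that have no parts come back with a NULL partnum, which differs from the original query.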

First of all, I question the use of an outer join, seeing as the ORDER BY is operating on the right-hand side, and the NULLs injected by the left join are likely to play havoc with it.
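
A quick sketch to check whether the outer join is needed at all: count the parts that have no matching brand. In MySQL, NULLs sort first in ascending order, so any such rows would be the first ones returned by the original query.

```sql
SELECT COUNT(*) AS parts_without_brand
FROM products_main AS pm
LEFT JOIN products_brands AS pb ON pm.id_brand = pb.id_brand
WHERE pb.id_brand IS NULL;   -- keeps only rows for which the join found no brand
```

If this returns 0, an inner join gives the same result as the left join.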

Regardless, the simplest approach to speeding up this query would be a covering index on pb.id_brand and pb.brand. This will allow the ORDER BY to be evaluated 'using index' with the join condition. The alternative is to find some way to reduce the size of the intermediate result passed to the ORDER BY.

Still, the combination of outer join, ORDER BY, and LIMIT leaves me wondering what exactly you are querying for, and whether there might be a better way of expressing the query itself.
