How to optimize this MySQL slow (very slow) query?

I have a 2 GB MySQL table with 500k rows, and I run the following query on a system with no load.

select * from mytable 
where name in ('n1', 'n2', 'n3', 'n4', ... bunch more... ) 
order by salary

It uses a filesort and takes between 50 and 70 seconds to complete.

When I remove the order by salary and do the sorting in the application, the total runtime (including the sorting) drops to about 25-30 seconds. But that's still far too much.

Any idea how I can speed this up?

Thank you.

Drop the list of names into a temporary table and then do an inner join on the two tables. This is much faster than combing through that entire list for each row. Here's the pseudocode:

create temporary table names
    (name varchar(255));

insert into names values ('n1'),('n2'),...,('nn');

select
    a.*
from
    mytable a
    inner join names b on
        a.name = b.name

Also note that name should have an index on it. That makes things go much faster. Thanks to Thomas for making this note.
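A minimal sketch of adding that index, assuming name is not indexed yet. The index name idx_mytable_name is made up for illustration, and a very long or TEXT-typed column may need a prefix length.

```sql
-- Hypothetical index name; use whatever fits your naming scheme.
create index idx_mytable_name on mytable (name);

-- List the indexes on mytable to confirm the new one is there.
show index from mytable;
```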

Some ideas:

  • Do you need to select *, or can you get away with selecting only a subset of the columns?
  • If you can get away with selecting a subset, you could add a covering index that is already sorted by salary.
  • If all the names share the same pattern, you could use name LIKE 'n%' instead of the IN list.
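The covering-index idea could look roughly like this, assuming the application only needs name and salary. The index name idx_salary_name is hypothetical, and whether the optimizer actually picks the index depends on the data.

```sql
-- Hypothetical index name. salary comes first so the index order
-- matches the ORDER BY; name is included so the index covers the query.
create index idx_salary_name on mytable (salary, name);

-- Only columns stored in the index are selected, so MySQL can answer
-- the query from the index without reading the table rows.
select name, salary
from mytable
where name in ('n1', 'n2', 'n3')
order by salary;
```

If the index is used as a covering index, EXPLAIN reports "Using index" in the Extra column.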

Try selecting the rows you want using a subquery, and then order the results of that subquery. See this question.
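One way to write that, as a sketch with a shortened name list (the alias t is arbitrary):

```sql
-- Filter first in a derived table, then sort only the matching rows.
select t.*
from (
    select *
    from mytable
    where name in ('n1', 'n2', 'n3')
) as t
order by t.salary;
```

Recent MySQL versions may merge the derived table into the outer query, in which case the plan ends up the same as for the original statement, so it is worth checking with EXPLAIN.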

And you do have an index on name in mytable, right?

Depending on the data distribution and the number of rows your WHERE clause matches, you may want to try an index on (salary, name) or even (name, salary), although the latter will most probably not be very useful for that kind of query.

You may also want to increase your sort_buffer_size setting. Test everything separately and compare the output of EXPLAIN.
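A rough sketch of testing one change at a time. The 4 MB value is an arbitrary illustration, not a recommendation, and the name list is shortened.

```sql
-- Show the current value, in bytes.
show variables like 'sort_buffer_size';

-- Raise it for this session only; 4194304 bytes = 4 MB (illustrative value).
set session sort_buffer_size = 4194304;

-- Re-run after each change and compare the key and Extra columns,
-- for example whether "Using filesort" is still reported.
explain
select * from mytable
where name in ('n1', 'n2', 'n3')
order by salary;
```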

create index xyz on mytable(name(6));

"IN" queries are almost always inefficient, as they are conceptually processed like this:

select * from mytable where name = 'n1'
or name = 'n2'
or name = 'n3'
...

The index I've given above may mean the query optimizer accesses the rows by index instead of by a table scan.
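Whether a 6-character prefix is selective enough depends on the data. One way to check, as a sketch that is not part of the original answer, is to compare distinct counts:

```sql
-- If the two counts are close, the 6-character prefix distinguishes
-- names almost as well as the full column does.
select
    count(distinct name)          as distinct_full,
    count(distinct left(name, 6)) as distinct_prefix_6
from mytable;
```

If the prefix count is much lower, a longer prefix or an index on the full column may work better.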

Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you republish, please credit this site's URL or the original address.