
How to determine cause of slow queries

I have a query that takes around 7 seconds. When I break it into two queries that return the same data as the first, they take around 0.01 seconds. I had thought that I had indexed it properly, but likely I had not. The database currently has very little data in it. I am using MySQL 5.5.46. I am using PHP with PDO; however, I don't think that is relevant, and I purposely did not tag this question with PHP or PDO.

I am not asking why my specific query is taking so long or how to identify slow queries; I am asking for the generic steps to determine the cause of a slow query. I expect that EXPLAIN will be used. What are you looking for in EXPLAIN? What other steps could one take?

Spencer7593's suggestion is a very good place to start, but you aren't going to get a full answer there, or here on StackOverflow. A partial explanation took me about 40 pages.

EXPLAIN is useful, but it needs to be read with an understanding of the structure of the tables and indexes. From your description, it seems likely that the optimizer is ignoring an index. You can force the DB to use a particular index for a query, but it's a rather untidy solution (even if you know it is the best solution today, it might not be in future).
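As a minimal sketch of that comparison, the statements below run EXPLAIN on the same query with and without an index hint. The table `orders`, its columns, and the index `idx_status` are hypothetical names made up for illustration; they do not come from the question.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE IF NOT EXISTS orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  status      VARCHAR(16)  NOT NULL,
  created_at  DATETIME     NOT NULL,
  KEY idx_status (status)
) ENGINE=InnoDB;

-- Plan the optimizer picks on its own: check the `key` column.
EXPLAIN
SELECT id, created_at
FROM orders
WHERE status = 'shipped';

-- Same query restricted to idx_status: compare `key` and `rows`.
EXPLAIN
SELECT id, created_at
FROM orders FORCE INDEX (idx_status)
WHERE status = 'shipped';
```

If the hinted plan is clearly better, that points at stale statistics or skewed data rather than at a missing index, which is worth fixing at the source before leaving the hint in place.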

If you have a perfectly good index and the DBMS is not using it, then the most likely cause is that the cardinality stats have not been updated. It can also occur when the data is very skewed (e.g. if you have 10000 values of 'A' and 2 of 'B', then an index will help you find records with 'B' but not records with 'A').
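A rough way to check both causes, assuming the same kind of hypothetical `orders` table with an index on `status` (names invented for illustration):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE IF NOT EXISTS orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  status      VARCHAR(16)  NOT NULL,
  created_at  DATETIME     NOT NULL,
  KEY idx_status (status)
) ENGINE=InnoDB;

-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE orders;

-- The Cardinality column is the optimizer's estimate of distinct values.
SHOW INDEX FROM orders;

-- Actual number of distinct values, to compare against the estimate.
SELECT COUNT(DISTINCT status) FROM orders;

-- Distribution of values: a heavily skewed result explains why the
-- index helps for rare values but not for common ones.
SELECT status, COUNT(*) AS n
FROM orders
GROUP BY status
ORDER BY n DESC;
```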

Using an index does not always make your queries faster: sequential reads from a single file are much faster than random reads on 2 files.

Another caveat is that MySQL does not handle pushed predicates very well: a condition in the outer query is often not pushed down into a subquery or derived table.

Beware of implicit (and explicit) type conversions in joins: MySQL can't use an index on a column that has to be converted. MariaDB supports virtual columns (which can be indexed). Hence if you write

...
tab_a INNER JOIN tab_b
ON UNIX_TIMESTAMP(tab_a.datetime)=tab_b.seconds_since_epoch

the optimizer can use an index on tab_b.seconds_since_epoch, but not one on tab_a.datetime.
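One possible workaround, sketched here under assumptions, is to move the conversion to the other side of the comparison so that the index on tab_a.datetime becomes usable instead. The column types and index names below are assumed (the original gives only the column names), and the two conversions depend on the session time zone, so they may not be exact inverses around daylight-saving changes.

```sql
-- Assumed definitions; only the table and column names come from the text.
CREATE TABLE IF NOT EXISTS tab_a (
  id         INT UNSIGNED NOT NULL PRIMARY KEY,
  `datetime` DATETIME     NOT NULL,
  KEY idx_datetime (`datetime`)
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS tab_b (
  id                  INT UNSIGNED NOT NULL PRIMARY KEY,
  seconds_since_epoch INT UNSIGNED NOT NULL,
  KEY idx_seconds (seconds_since_epoch)
) ENGINE=InnoDB;

-- Function wraps tab_a.datetime: idx_datetime cannot serve the join lookup.
EXPLAIN
SELECT a.id, b.id
FROM tab_a AS a
INNER JOIN tab_b AS b
  ON UNIX_TIMESTAMP(a.`datetime`) = b.seconds_since_epoch;

-- Function moved to the tab_b side: idx_datetime can now be used,
-- but idx_seconds no longer can.
EXPLAIN
SELECT a.id, b.id
FROM tab_a AS a
INNER JOIN tab_b AS b
  ON a.`datetime` = FROM_UNIXTIME(b.seconds_since_epoch);
```

Which form is better depends on which table is smaller and which one the optimizer reads first, so compare the two plans rather than assuming.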

With some engines (and with named locks) queries can be blocked by other activity in the DBMS. Such cases usually show up in stats-based analysis of DBMS performance, and this is unlikely to be the cause here. Another step is required to track down what is doing the blocking.
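For InnoDB on MySQL 5.5, that extra step could look something like the sketch below. It relies on the `information_schema` InnoDB tables that ship with 5.5; later major versions changed or removed them, so treat this as version-specific.

```sql
-- Sessions currently running or waiting, with their state and statement.
SHOW FULL PROCESSLIST;

-- Which transaction is waiting, and which one is blocking it.
SELECT r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_query           AS blocking_query
FROM information_schema.INNODB_LOCK_WAITS AS w
JOIN information_schema.INNODB_TRX AS b
  ON b.trx_id = w.blocking_trx_id
JOIN information_schema.INNODB_TRX AS r
  ON r.trx_id = w.requesting_trx_id;
```

The thread ids match the Id column of the process list, which is how the blocking session can be identified.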

Decomposing the query into smaller parts and testing them independently is an excellent diagnostic tool (kudos!), but it's only when you look at all the EXPLAIN plans that you can understand why you get aberrant behaviour in the composite.

Above all, if you are able to use phpMyAdmin, it has a great profiler.

After you run your query in phpMyAdmin, you have the option to use "Profiling" (right before the edit links).

It gives you a nice graph and a table of jobs and timings, so I think it would be helpful.
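Without phpMyAdmin, similar per-stage timings are available from MySQL's session profiling, which, as far as I can tell, is what the phpMyAdmin option is built on. A minimal sketch for MySQL 5.5 (the profiled statement here is just a stand-in for your own query):

```sql
-- Turn on profiling for this session only.
SET profiling = 1;

-- Stand-in statement; replace it with the slow query.
SELECT COUNT(*) FROM information_schema.TABLES;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Per-stage timings; use the Query_ID reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

SET profiling = 0;
```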

Well, this is very generic, but I will try to provide some guidelines:

  • Number one is indexes: if you search on a field, you need an index on that field.
  • If you search on multiple fields, you probably need a composite index rather than multiple single-column indexes.
    • Filtering on a subquery doesn't use an index, so be careful if you filter over a subquery.
    • Applying a function to a column in the WHERE clause also prevents index use: things like SUBSTRING, UPPER, or LIKE with a leading wildcard.
  • Using INNER JOIN without ON produces a CROSS JOIN and multiplies the number of rows very quickly.
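A few of these guidelines can be checked directly with EXPLAIN. The sketch below uses a hypothetical `customers` table with a composite index; all names are invented for illustration.

```sql
-- Hypothetical table with one composite index, for illustration only.
CREATE TABLE IF NOT EXISTS customers (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  last_name VARCHAR(64)  NOT NULL,
  city      VARCHAR(64)  NOT NULL,
  KEY idx_last_name_city (last_name, city)
) ENGINE=InnoDB;

-- Both filtered columns are covered by the one composite index.
EXPLAIN
SELECT id FROM customers
WHERE last_name = 'Smith' AND city = 'Leeds';

-- Function on the column: no index lookup is possible, so expect a
-- full scan of the table or of the whole index.
EXPLAIN
SELECT id FROM customers
WHERE UPPER(last_name) = 'SMITH';

-- Leading wildcard: the index cannot narrow the search.
EXPLAIN
SELECT id FROM customers
WHERE last_name LIKE '%mith';

-- Prefix pattern: a range scan on the index is possible.
EXPLAIN
SELECT id FROM customers
WHERE last_name LIKE 'Smi%';
```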

In the query execution plan, look for full sequential scans where you expected index scans (in MySQL's EXPLAIN output, a full table scan shows up as type ALL).

EXPLAIN shows subqueries, which indexes are actually used, how many rows it has to scan, etc. See the MySQL manual on its output.
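As a rough guide to reading that output, here is a sketch on two hypothetical tables (`customers` and `orders`, names invented for illustration), with the columns worth checking listed in the comments.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE IF NOT EXISTS customers (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  last_name VARCHAR(64)  NOT NULL,
  city      VARCHAR(64)  NOT NULL,
  KEY idx_last_name_city (last_name, city)
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  status      VARCHAR(16)  NOT NULL,
  created_at  DATETIME     NOT NULL,
  KEY idx_status (status)
) ENGINE=InnoDB;

EXPLAIN
SELECT c.last_name, o.created_at
FROM orders AS o
INNER JOIN customers AS c ON c.id = o.customer_id
WHERE o.status = 'shipped'
ORDER BY o.created_at;

-- Columns to read in the result, one row per table access:
--   select_type    SIMPLE, SUBQUERY, DERIVED, ... (shows subqueries)
--   type           access method; ALL means a full table scan
--   possible_keys  indexes the optimizer considered
--   key            index actually chosen (NULL means none)
--   rows           estimated number of rows examined
--   Extra          notes such as "Using filesort" or "Using temporary"
```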

Then there is the magical "staring at it" method, which usually produces ideas on how the query's complexity can be reduced:

  • fewer queries are better
  • indexes are better than full table scans
  • joins can be better than subqueries
  • fewer joins are better (because joins increase the number of scanned rows, sometimes by multiples)
  • more selective indexes are better than less selective ones, so that fewer rows remain to scan after the index lookup
  • grouping and sorting add cost
  • HAVING can be more expensive than WHERE (because it is applied after grouping)

and so on
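The HAVING versus WHERE point, for instance, can be illustrated with two queries that return the same result on a hypothetical `orders` table (names invented for illustration):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE IF NOT EXISTS orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  status      VARCHAR(16)  NOT NULL,
  created_at  DATETIME     NOT NULL,
  KEY idx_status (status)
) ENGINE=InnoDB;

-- Filter applied after grouping: every status is grouped and counted,
-- then all groups but one are thrown away.
SELECT status, COUNT(*) AS n
FROM orders
GROUP BY status
HAVING status = 'shipped';

-- Filter applied before grouping: only matching rows are read,
-- and the index on status can be used to find them.
SELECT status, COUNT(*) AS n
FROM orders
WHERE status = 'shipped'
GROUP BY status;
```

HAVING is still the right tool when the condition is on an aggregate, such as `HAVING COUNT(*) > 10`.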

Steps to optimise the run time of a query. You should check the speed of the query after each step in which you were able to make a change:

  1. Take a look at your query in general and try to confirm that it only queries what it is supposed to query. Look for unused fields, unnecessary joins, unnecessary outer joins. Consider using LIMIT to limit the number of records returned. Remember, assembling a larger result set than needed also requires additional time to create and send over to the client.
  2. Now, take a closer look at your query again and see if you can simplify it. For example, you may have subqueries in the select list, which you can try to convert into derived tables. Also take a look at your WHERE criteria and confirm whether they can use indexes (expressions such as LIKE '%xxx%' cannot). If they cannot, then check whether you can change them into something that can use an index.
  3. If you have any indexes on any affected tables, it is time to refresh their statistics using the ANALYZE TABLE command, just to be on the safe side. Check the cardinality of the existing indexes. MySQL is less likely to use indexes with low cardinality. If the cardinality is far from what you think it should be (the number of unique values in a given field), then you may want to tweak how MySQL samples the data to calculate cardinality.
  4. Run EXPLAIN and check that
    • indexes are used where you expect them (possible_keys, key)
    • join type ALL and filesort are avoided

Try adding indexes for those parts of the query where none are used, or, if you believe you already have indexes in place, use index hints such as FORCE INDEX to make MySQL use your index.

  5. If the query is still slow, then you may have to tweak server-side variables, use a different table engine, partition a table, change the data structure (denormalisation), archive old data to reduce size, etc.

You could write long articles about each individual step, or, in the case of no. 5, about each item.
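As one small illustration of step 2, here is a sketch of turning a subquery in the select list into a derived table. The `customers` and `orders` tables are hypothetical. Whether the rewrite is actually faster depends on the data and the MySQL version (as far as I know, 5.5 materializes derived tables without indexes), so compare both plans and timings as the steps suggest.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE IF NOT EXISTS customers (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  last_name VARCHAR(64)  NOT NULL,
  city      VARCHAR(64)  NOT NULL,
  KEY idx_last_name_city (last_name, city)
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  status      VARCHAR(16)  NOT NULL,
  created_at  DATETIME     NOT NULL,
  KEY idx_status (status)
) ENGINE=InnoDB;

-- Subquery in the select list: evaluated once per customer row.
SELECT c.id,
       c.last_name,
       (SELECT COUNT(*)
        FROM orders AS o
        WHERE o.customer_id = c.id) AS order_count
FROM customers AS c;

-- Same result via a derived table: orders are aggregated once,
-- then joined. COALESCE keeps customers with no orders at 0.
SELECT c.id,
       c.last_name,
       COALESCE(oc.order_count, 0) AS order_count
FROM customers AS c
LEFT JOIN (
  SELECT customer_id, COUNT(*) AS order_count
  FROM orders
  GROUP BY customer_id
) AS oc ON oc.customer_id = c.id;
```

Prefixing each statement with EXPLAIN shows the first as a DEPENDENT SUBQUERY and the second as a DERIVED table, which makes the difference between the two visible in the plan.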
