Why does adding a WHERE clause (on an indexed column) to my query increase the run time from seconds to minutes?

Question

My problem is with this query in MySQL:

select 
    SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
    SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
FROM parts
-- where parttypeid = 1

When the WHERE clause is uncommented, my run time jumps from 4.5 seconds to 341 seconds. There are approximately 21M total records in this table.

My EXPLAIN looks like this, which seems to indicate that it is using the index I have on PartTypeId.

id  select_type  table  type  possible_keys  key         key_len  ref    rows      Extra
1   SIMPLE       parts  ref   PartTypeId     PartTypeId  1        const  11090057

I created my table using this query:

CREATE TABLE IF NOT EXISTS parts (
    Id INTEGER NOT NULL PRIMARY KEY, 
    PartTypeId TINYINT NOT NULL, 
    OrderThreshold INTEGER NOT NULL, 
    PartName VARCHAR(500), 
    INDEX(Id),
    INDEX(PartTypeId),
    INDEX(OrderThreshold)
);

The query without the WHERE clause returns:

LOW_COUNT   HIGH_COUNT
3570        3584

With the WHERE clause, the results look like this:

LOW_COUNT   HIGH_COUNT
2791        2147

How can I improve the performance of my query so that the run time stays in the range of seconds (instead of minutes) when I add a WHERE clause that only looks at one column?

Answer 1

Try

select SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
       SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
from parts 
where parttypeid = 1
and OrderThreshold not between @LOW_COST and @HIGH_COST

and

select count(*) as LOW_COUNT, null as HIGH_COUNT
from parts 
where parttypeid = 1
and OrderThreshold < @LOW_COST
union all
select null, count(*) 
from parts 
where parttypeid = 1
and OrderThreshold > @HIGH_COST

Answer 2

Your accepted answer doesn't explain what is going wrong with your original query:

select SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
       SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
from parts
where parttypeid = 1;

The index is being used to find the results, but there are a lot of rows with parttypeid = 1. (The EXPLAIN estimate of about 11M rows out of roughly 21M supports this.) I am guessing that each data page probably has at least one such row. That means that all the rows are being fetched, but they are being read out of order. That is slower than just doing a full table scan (as in the first query). In other words, all the data pages are being read, but the index is adding overhead on top.

As Juergen points out, a better form of the query moves the conditions into the where clause:
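One way to check this explanation is to compare the plan the optimizer picks with a plan where the PartTypeId index is excluded. The following is only a sketch: the threshold values are made up (the question does not give them), and the index name PartTypeId is taken from the key column of the EXPLAIN output in the question.

```sql
-- Hypothetical thresholds; the question does not state the real values.
SET @LOW_COST = 10, @HIGH_COST = 1000;

-- Plan as chosen by the optimizer (the question shows type=ref, key=PartTypeId).
EXPLAIN
SELECT SUM(OrderThreshold < @LOW_COST)  AS LOW_COUNT,
       SUM(OrderThreshold > @HIGH_COST) AS HIGH_COUNT
FROM parts
WHERE PartTypeId = 1;

-- Same query with an index hint that tells MySQL not to consider the
-- PartTypeId index, so the rows are filtered while scanning the table.
EXPLAIN
SELECT SUM(OrderThreshold < @LOW_COST)  AS LOW_COUNT,
       SUM(OrderThreshold > @HIGH_COST) AS HIGH_COUNT
FROM parts IGNORE INDEX (PartTypeId)
WHERE PartTypeId = 1;
```

If the hinted version runs in roughly the same few seconds as the query without the WHERE clause, that would be consistent with the index lookups being the source of the slowdown.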

select SUM(OrderThreshold < @LOW_COST) as LOW_COUNT,
       SUM(OrderThreshold > @HIGH_COST) as HIGH_COUNT
from parts
where parttypeid = 1 AND
      (OrderThreshold < @LOW_COST OR OrderThreshold > @HIGH_COST)

(I prefer this form, because the where conditions match the conditions inside the SUM() expressions.) For this query, you want an index on parts(parttypeid, OrderThreshold). I'm not sure about the MySQL optimizer in this case, but it might be better to write it as:

select 'Low' as which, count(*) as CNT
from parts
where parttypeid = 1 AND
      OrderThreshold < @LOW_COST
union all
select 'High', count(*) as CNT
from parts
where parttypeid = 1 AND
      OrderThreshold > @HIGH_COST;

Each subquery should definitely use the index in this case. (If you want them in one row with two columns, there are a couple of ways to achieve that, but I'm guessing that is not so important.)
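For completeness, one of the ways to get both counts in a single row is to wrap each count in a scalar subquery. This is a sketch rather than a tuned solution; it assumes the same @LOW_COST and @HIGH_COST user variables as the queries above and reuses the column aliases from the question.

```sql
-- Each scalar subquery is a separate range count on (parttypeid, OrderThreshold);
-- the outer SELECT only puts the two results side by side in one row.
SELECT
    (SELECT COUNT(*)
     FROM parts
     WHERE parttypeid = 1
       AND OrderThreshold < @LOW_COST)  AS LOW_COUNT,
    (SELECT COUNT(*)
     FROM parts
     WHERE parttypeid = 1
       AND OrderThreshold > @HIGH_COST) AS HIGH_COUNT;
```

The result has the same shape as the original query's output, so existing code that reads LOW_COUNT and HIGH_COUNT should not need to change.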

Unfortunately, the best index for your query without the where clause is parts(OrderThreshold). This is a different index from the one above.
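As a rough sketch of the indexing described above: the table definition in the question already has a single-column index on OrderThreshold, so only the composite index needs to be added. The index name idx_parttype_threshold is invented for this example, and on a table of about 21M rows building it may take a while.

```sql
-- Composite index for the filtered queries: equality column first,
-- range column second. The index name is hypothetical.
ALTER TABLE parts
    ADD INDEX idx_parttype_threshold (PartTypeId, OrderThreshold);

-- Check which index each form of the query is planned to use.
EXPLAIN
SELECT COUNT(*) AS CNT
FROM parts
WHERE PartTypeId = 1
  AND OrderThreshold < @LOW_COST;

-- The unfiltered query can only benefit from the existing
-- single-column index on OrderThreshold.
EXPLAIN
SELECT COUNT(*) AS CNT
FROM parts
WHERE OrderThreshold < @LOW_COST;
```

Because both columns the filtered query touches are in the composite index, the plan may be able to answer it from the index alone, without reading the table rows; the Extra column of EXPLAIN is where to look for that.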

Answer 1
4 2015-04-11 20:20:44

Answer 2
2 Accepted 2015-04-12 01:51:30
