
Prohibit MySQL from using full table scan on a query

Is there any way to prohibit MySQL from performing a full table scan when the result cannot be found using indexes?

For example, this query:

SELECT *
FROM a
WHERE (X BETWEEN a.B AND a.C) 
ORDER BY a.B DESC 
LIMIT 1;

This is only efficient if X satisfies the condition and at least one row is returned. If no row in the table satisfies the condition, a full scan is performed, which can be very costly.

I don't want to optimize this particular query; it is just an example.

EXPLAIN on this query gives the same plan whether X is inside or outside of the range:

id  select_type  table  type   possible_keys  key      key_len  ref  rows    filtered  Extra
1   SIMPLE       a      range  long_ip        long_ip  8        \N   116183  100.00    Using where

The status variables show much more useful information. For X outside of the range:

Handler_read_prev 84181
Key_read_requests 11047

For X in range:

Handler_read_key 1
Key_read_requests 12

If only there were a way to prevent Handler_read_prev from ever growing past 1.
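For readers who want to reproduce counters like the ones quoted above, here is a minimal sketch of how they can be collected for a single statement. The value assigned to @x is a hypothetical stand-in for X, and FLUSH STATUS may require additional privileges on your server.

```sql
-- Reset the session status counters so that only the next statement is measured.
FLUSH STATUS;

-- Hypothetical probe value standing in for X.
SET @x = 3232235777;

SELECT *
FROM a
WHERE (@x BETWEEN a.B AND a.C)
ORDER BY a.B DESC
LIMIT 1;

-- Index read operations performed by this session since the reset.
SHOW SESSION STATUS LIKE 'Handler_read%';

-- Key cache read requests (relevant to MyISAM tables).
SHOW STATUS LIKE 'Key_read_requests';
```

Comparing the Handler_read_prev value for a probe inside the range with one outside it shows how far the server walked the index before giving up.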

UPDATE: I can't accept my own answer, because it doesn't really answer the question (HANDLER is a great feature, though). It seems to me that there is no general way to prevent MySQL from doing a full scan. Simple conditions like key='X' will be treated as "impossible WHERE", but more complex conditions like BETWEEN will not.
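A small sketch of the "impossible WHERE" case mentioned in the update, assuming id is the primary key of a and that no row has id = -1 (both are assumptions made for illustration):

```sql
-- Equality on a unique key: the optimizer can probe the index once during
-- planning and learn that no row matches, without scanning anything.
EXPLAIN
SELECT *
FROM a
WHERE id = -1;
```

Depending on the MySQL version, the Extra column for such a statement typically carries a note that the WHERE clause cannot match any row. A BETWEEN over two columns gives the optimizer no single key value to probe, so it gets no such shortcut.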

You could write a "fully covered" subquery that only uses data that is available in indexes. Based on the returned primary key, you can then look up the rows in the master table.

The following query is fully covered by indexes on (id), (B,id), and (C,id):

select *
from a
where id in (
    select id
    from a 
    where x <= C
    and id in (
        select id
        from a
        where B <= X 
    )
)
limit 1

Each SELECT uses one index: the innermost SELECT uses the index on (B,id), the middle SELECT uses the index on (C,id), and the outer SELECT uses the primary key.
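A sketch of the indexes this answer relies on, and a way to check whether they are actually used. The index names idx_b_id and idx_c_id and the value of @x are hypothetical, and id is assumed to already be the primary key of a.

```sql
-- Secondary indexes that contain everything the two subqueries read.
CREATE INDEX idx_b_id ON a (B, id);
CREATE INDEX idx_c_id ON a (C, id);

-- Hypothetical probe value standing in for X.
SET @x = 3232235777;

-- Inspect the plan: "Using index" in the Extra column means that step
-- was answered from the index alone, without touching the table rows.
EXPLAIN
SELECT *
FROM a
WHERE id IN (
    SELECT id
    FROM a
    WHERE @x <= C
    AND id IN (
        SELECT id
        FROM a
        WHERE B <= @x
    )
)
LIMIT 1;
```

The optimizer may rewrite nested IN subqueries, so the plan you see can differ between MySQL versions; treat the EXPLAIN output as the source of truth rather than the query text.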

Here is what I came up with in the end:

HANDLER a OPEN;
HANDLER a READ BC <= (X);
HANDLER a CLOSE;

BC is the name of the key on (B,C). If we order the table by B DESC, then the result is guaranteed to be equal to that of

SELECT *
FROM a
WHERE (X BETWEEN a.B AND a.C) 
ORDER BY a.B DESC 
LIMIT 1;

Now, if X is not in the range of table a, we just have to check whether a.C of the returned row is greater than or equal to X (BETWEEN is inclusive). If it is not, then X is definitely outside of the range and we don't need to look any further.
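The same "fetch one candidate, then check C" idea can also be sketched in plain SQL. This is not the HANDLER approach itself, only an illustration of its logic, and it assumes that the [B, C] ranges in a do not overlap (otherwise the row with the largest B below X is not necessarily the only candidate). The value of @x is hypothetical.

```sql
-- Hypothetical probe value standing in for X.
SET @x = 3232235777;

SELECT candidate.*
FROM (
    -- Read at most one row: the one with the largest B not above X.
    -- With an index whose first column is B this is a single index probe.
    SELECT *
    FROM a
    WHERE B <= @x
    ORDER BY B DESC
    LIMIT 1
) AS candidate
-- Keep the candidate only if its range really contains X.
WHERE candidate.C >= @x;
```

Because the inner query stops after one row, a probe that falls outside every range returns an empty result without walking the rest of the index.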

This is not very elegant, though, and you will have to re-sort the table on each insert or update.
