Why doesn't this index work (MySQL)

I have this table:

CREATE TABLE  `maindb`.`daily_info` (
  `di_date` date NOT NULL,
  `di_sid` int(10) unsigned NOT NULL default '0',
  `di_type` int(10) unsigned NOT NULL default '0',
  `di_name` varchar(20) NOT NULL default '',
  `di_num` int(10) unsigned NOT NULL default '0',
  `di_abt` varchar(1) NOT NULL default 'a',
  PRIMARY KEY  (`di_date`,`di_sid`,`di_type`,`di_name`,`di_abt`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

When I use this query:

explain
SELECT  MONTH(di_date) as label1, DAYOFMONTH(di_date) as label2, sum(di_num) as count , di_abt as abt
FROM   `daily_info`
WHERE  di_sid=6
       AND di_type = 4
       AND di_name='clk-1'
       AND di_date > '2009-10-01' AND di_date < '2009-10-16'
GROUP BY
       DAYOFMONTH(di_date)
ORDER BY
       TO_DAYS(di_date) DESC

I get:

1, 'SIMPLE', 'daily_info', 'range', 'PRIMARY', 'PRIMARY', '3', '', 2500, 'Using where; Using temporary; Using filesort'

But if the key worked and the rows were filtered by di_date, di_sid and di_type, the query would need to examine only a few dozen rows.

What is wrong with the index (or the query)?

Thanks!

You use a range condition on the first index column, which rules out using the index to filter on the other columns. The key_len of 3 in your plan reflects this: it is the length of the DATE column alone.

There is no single contiguous range in this index which would contain those and only those records that satisfy the condition.

MySQL is not able to do a SKIP SCAN, which would jump over the distinct values of di_date. That's why it does its best: it uses range access to filter on di_date and uses WHERE to filter on all other fields.

Either recreate the index like this (the best option):

PRIMARY KEY  (`di_sid`,`di_type`,`di_name`,`di_date`,`di_abt`)
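
A minimal sketch of how that change could be applied, assuming you are free to rebuild the primary key (MySQL rebuilds the table for this, which can take a while on large tables). The statement is illustrative; check it against your own schema first.

```sql
-- Move the equality columns (di_sid, di_type, di_name) ahead of the
-- range column (di_date), so that the matching rows form one
-- contiguous range in the index.
ALTER TABLE daily_info
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (di_sid, di_type, di_name, di_date, di_abt);

-- Re-check the plan: key_len should now cover more than di_date alone.
EXPLAIN
SELECT  SUM(di_num)
FROM    daily_info
WHERE   di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
        AND di_date > '2009-10-01' AND di_date < '2009-10-16';
```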

Or, if you're unable to recreate the index, you can emulate the SKIP SCAN:

SELECT  MONTH(di.di_date) as label1, DAYOFMONTH(di.di_date) as label2, sum(di.di_num) as count , di.di_abt as abt
FROM    (
        SELECT  DISTINCT di_date
        FROM    daily_info
        WHERE   di_date > '2009-10-01' AND di_date < '2009-10-16'
        ) do
JOIN    daily_info di
ON      di.di_date <= do.di_date
        AND di.di_date>= do.di_date
        AND di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
GROUP BY
        DAYOFMONTH(di.di_date)
ORDER BY
        TO_DAYS(di.di_date) DESC

Make sure that "Using index for group-by" and "Range checked for each record" are present in the plan.

This condition:

di.di_date <= do.di_date
AND di.di_date >= do.di_date

is used instead of a simple di.di_date = do.di_date to force the range checking.

See this article in my blog for a more detailed explanation of emulating SKIP SCAN:

Update:

The query above actually needs only an equijoin, and MySQL optimizes that without the trick.

The trick above applies only to ranged queries, i.e. when the innermost loop should use range access, not ref access.

It would be useful if you had to do something like di_name <= 'clk-1'.

This query should work fine:

SELECT  MONTH(di.di_date) as label1, DAYOFMONTH(di.di_date) as label2, sum(di.di_num) as count , di.di_abt as abt
FROM    (
        SELECT  DISTINCT di_date
        FROM    daily_info
        WHERE   di_date > '2009-10-01' AND di_date < '2009-10-16'
        ) do
JOIN    daily_info di
ON      di.di_date = do.di_date
        AND di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
GROUP BY
        DAYOFMONTH(di.di_date)
ORDER BY
        TO_DAYS(di.di_date) DESC

Make sure that di uses ref access on the whole subkey possible here, with key_len = 33 (3 bytes for di_date, 4 each for di_sid and di_type, and 22 for di_name).

Update 2

In your query, you are using these expressions outside the GROUP BY:

MONTH(di_date)
TO_DAYS(di_date)
di_abt

The query as it is now will sum all values for the 1st, 2nd, etc. of any month and year.

I.e. for the first group it will add up all values from Jan 1st, 2000, then Feb 1st, 2000, etc.

Then it will return an arbitrary value of MONTH, an arbitrary value of TO_DAYS and an arbitrary value of di_abt from each group.

Your condition currently falls within a single month, so it's OK for now, but if your condition spans multiple months (to say nothing of years), the query will produce unexpected results.

Do you really want to group by dates?
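
If the intent is one row per calendar day, a sketch like the following avoids the problem. It assumes that di_abt should also split the totals (one row per date and di_abt value); if that is not what you want, aggregate di_abt or drop it from the select list instead.

```sql
-- Group by the full date (plus di_abt) so that every selected
-- expression is determined by the grouping columns.
SELECT  MONTH(di_date) AS label1,
        DAYOFMONTH(di_date) AS label2,
        SUM(di_num) AS count,
        di_abt AS abt
FROM    daily_info
WHERE   di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
        AND di_date > '2009-10-01' AND di_date < '2009-10-16'
GROUP BY
        di_date, di_abt
ORDER BY
        di_date DESC, di_abt;
```

Grouping by di_date keeps days from different months in separate groups, and ordering by di_date gives the same order as TO_DAYS(di_date).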

You are range-scanning the first part of the index, so MySQL cannot use the subsequent parts of the index.

The way to improve this is to create another index with the fields in a different order, one that is better suited to this particular query.

If your index were on di_sid, di_type, di_date, it might work better.
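
As a rough sketch, such an index could be added alongside the existing primary key like this. The index name idx_sid_type_date is made up for illustration; any unused name works.

```sql
-- Secondary index with the equality columns first and the range
-- column (di_date) last.
ALTER TABLE daily_info
    ADD INDEX idx_sid_type_date (di_sid, di_type, di_date);

-- Check whether the optimizer picks the new index for the query.
EXPLAIN
SELECT  SUM(di_num)
FROM    daily_info
WHERE   di_sid = 6
        AND di_type = 4
        AND di_name = 'clk-1'
        AND di_date > '2009-10-01' AND di_date < '2009-10-16';
```

Since the query also has an equality condition on di_name, placing di_name before di_date in the index, as in (di_sid, di_type, di_name, di_date), may narrow the scan further.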
