简体   繁体   中英

MySQL index in between where clause and order by clause

My table structure is something like below:

CREATE TABLE test (
    id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,

    field_1 VARCHAR(60) NOT NULL,
    field_2 INT(10) UNSIGNED NULL,
    field_3 INT(10) UNSIGNED NULL,
    field_4 INT(10) UNSIGNED NULL,
    field_5 CHAR(2) NULL,
    field_6 INT(10) UNSIGNED NOT NULL,

    rank TINYINT(2) NOT NULL DEFAULT '0',   
    status TINYINT(3) NOT NULL DEFAULT '0',

    PRIMARY KEY (id),
    INDEX (status)

) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci ENGINE = MyISAM;

On above table the fields rank and status will have integer value between 0-9 and 0-4 respectively.

Currently the table is filled with around 950K data and I am trying to optimize my queries as much as possible.

Basically I need to select fields with some where clause with a descending order on field rank .

For example, below are few sql queries:

SELECT field_1, field_2, field_3 FROM test WHERE field_1 = 'data1' && status IN ('0', '1', '2') ORDER BY rank DESC LIMIT 0, 20;
SELECT field_1, field_2, field_3 FROM test WHERE field_2 = '5' && status IN ('1', '2') ORDER BY rank DESC LIMIT 0, 20;
SELECT field_1, field_2, field_3 FROM test WHERE field_5 = 'US' && status IN ('0', '2') ORDER BY rank DESC LIMIT 0, 20;

On above query ORDER BY rank DESC is very important. So I am quite confuse whether I should add index on single column or multi-column.

Could anyone suggest me the best solution.

Your key problem is that over 950k rows your status column has up to 4 distinct values. On a BTREE index, this will be a real pain to process.

Some more effective indexes to use for the 3 queries mentioned are probably the following

INDEX forQuery1 ( field_1 , status , rank ) USING BTREE,
INDEX forQuery2 ( field_2 , status , rank ) USING BTREE,
INDEX forQuery3 ( field_5 , status , rank ) USING BTREE,

You'll find that the second query especially should benefit however you're still going to have the issue where the variance of the data is very low for the size of dataset and most likely MySQL will fall back to a table scan though your EXPLAIN will probably show the LIMIT to lessen the effect of that. The indexes mentioned should be suitable for determining which rows to return though.

For further information on how MySQL uses indexes take a look through 13.1.13. CREATE INDEX Syntax in particular the section on B-Tree Index Characteristics and the following extracts

If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).

MySQL cannot use an index if the columns do not form a leftmost prefix of the index. Suppose that you have the SELECT statements shown here:

and

Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table. (In this case, a table scan is likely to be much faster because it requires fewer seeks.) However, if such a query uses LIMIT to retrieve only some of the rows, MySQL uses an index anyway, because it can much more quickly find the few rows to return in the result.

As an additional note, you don't need to quote numeric data types and so field_2 = 5 && status IN ( 1 , 2 ) is valid (and in fact I have had some strange issue in the past due to quoting integer datatypes instead of specifying them as numerics)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM