MySQL index on two columns when using select

Question

I have a very basic question here, so I hope the MySQL experts among you can answer it for me :)

I have this type of table:

+--------+------------------+------+-----+---------+-------+
| Field  | Type             | Null | Key | Default | Extra |
+--------+------------------+------+-----+---------+-------+
| id     | int(10) unsigned | NO   | PRI | NULL    |       |
| abc_1  | char(1)          | NO   | MUL | NULL    |       |
| abc_2  | char(2)          | NO   | MUL | NULL    |       |
| abc_3  | char(3)          | NO   | MUL | NULL    |       |
| abc_4  | char(4)          | NO   | MUL | NULL    |       |
| abc_5  | char(5)          | NO   | MUL | NULL    |       |
| abc_6  | char(6)          | NO   | MUL | NULL    |       |
| abc_7  | char(7)          | NO   | MUL | NULL    |       |
| abc_8  | char(8)          | NO   | MUL | NULL    |       |
| abc_9  | char(9)          | NO   | MUL | NULL    |       |
| abc_10 | char(10)         | NO   | MUL | NULL    |       |
+--------+------------------+------+-----+---------+-------+

There are a lot of records (several million) in this table.

All the queries look like:

SELECT `id` FROM `tbl` WHERE `abc_1` = 'a' LIMIT 10;
SELECT `id` FROM `tbl` WHERE `abc_2` = 'zz' LIMIT 10;
SELECT `id` FROM `tbl` WHERE `abc_3` = 'xxx' LIMIT 10;

and so on.

The table uses the InnoDB engine, and the collation of the abc columns is latin1_general_ci.

So my question is very simple: which index should I add to make these types of queries run faster?

Only single-column (e.g. abc_1, abc_2 and so on), two-column (e.g. id AND abc_1, id AND abc_2 and so on) or two-column in reverse order (e.g. abc_1 AND id, abc_2 AND id)?

I imagine the last variant would be the best (abc_1 + id). I could test and benchmark all the variants, but since it is a big table, creating a new index takes a lot of time, so I wanted to ask your opinion first.

Also, maybe someone could suggest a caching technique to run these types of queries faster without involving MySQL directly? I have heard that you can use Sphinx for this type of query, e.g. by adding the abc columns as attributes. Maybe someone has experience with that?

Thank you all in advance!

Answer 1

The best indexes for these queries are composite indexes: tbl(abc_1, id), tbl(abc_2, id), tbl(abc_3, id) (and so on). I believe this is the last option you listed.

These indexes "cover" your suggested queries. That means each query can be answered from the index alone, without loading the rows from the data pages.

If you have a mixture of all these queries running all the time, make sure you have enough memory to keep the indexes in memory.
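A minimal sketch of what this could look like, assuming the table is named tbl as in the question. The index names are made up for illustration, and only the first three columns are shown. If the Extra column of the EXPLAIN output reports "Using index", the query is being served from the index alone.

```sql
-- Hypothetical index names; repeat the pattern for abc_4 .. abc_10.
ALTER TABLE tbl
  ADD INDEX idx_abc_1_id (abc_1, id),
  ADD INDEX idx_abc_2_id (abc_2, id),
  ADD INDEX idx_abc_3_id (abc_3, id);

-- Check which index is chosen ("key") and whether it covers the query ("Extra").
EXPLAIN SELECT id FROM tbl WHERE abc_1 = 'a' LIMIT 10;
```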
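One rough way to check the memory point above is to compare the table's index size with the configured buffer pool. This sketch assumes the table is named tbl and lives in the currently selected database; the sizes InnoDB reports here are estimates, not exact figures.

```sql
-- Approximate data and secondary-index sizes for the table, in MB.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'tbl';

-- Configured buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```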

Answer 2

I would make id the primary key and then index the abc_* columns individually. In InnoDB, secondary indexes store the corresponding primary key value with every entry, so in effect they can "see" the primary key. That saves space and avoids the redundancy of listing id explicitly in each index.

In terms of performance, the two biggest levers you have are the buffer pool (http://dev.mysql.com/doc/refman/5.6/en/innodb-buffer-pool.html, specifically innodb_buffer_pool_size and innodb_buffer_pool_instances) and the queries you submit. If your abc_* columns are indexed (and id is the primary key), the queries you suggested will be very efficient. But you'd do well to keep an eye on your slow query log, and perhaps install Percona Toolkit and use its pt-query-digest tool to analyze your slow queries (e.g. http://www.percona.com/doc/percona-toolkit/pt-query-digest.html ). The latter is much more optional and, given the queries you're suggesting, probably wouldn't be necessary in the first place.

Btw, one thing I'd consider looking at is the length of the abc_* columns. If they are considerably long, you might try storing an MD5 of them, or normalizing your data a bit more and simply storing the ids (with the actual text values in a lookup table). The latter is not necessarily necessary (pardon that brilliant sentence), but if I had especially long text values in my abc_* columns, I would probably consider it.
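As a sketch of this approach, assuming the table is named tbl and using made-up index names: the DESCRIBE output in the question already shows MUL on the abc_* columns, so it is worth listing the existing indexes before adding anything. Because each InnoDB secondary index entry carries the primary key, EXPLAIN should still report "Using index" for a query that selects only id.

```sql
-- See which indexes already exist before creating duplicates.
SHOW INDEX FROM tbl;

-- Hypothetical index names; repeat the pattern for abc_4 .. abc_10.
ALTER TABLE tbl
  ADD INDEX idx_abc_1 (abc_1),
  ADD INDEX idx_abc_2 (abc_2),
  ADD INDEX idx_abc_3 (abc_3);

-- id comes from the secondary index entry itself, so no row lookup is needed.
EXPLAIN SELECT id FROM tbl WHERE abc_2 = 'zz' LIMIT 10;
```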
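For the slow query log mentioned above, something like the following could be used to switch it on at runtime. The one-second threshold is an arbitrary assumption, the statements need a privileged account, and settings changed with SET GLOBAL are lost on restart unless they are also put in the server configuration file.

```sql
-- Enable the slow query log without restarting the server.
SET GLOBAL slow_query_log = 'ON';

-- Log statements slower than 1 second (assumed threshold; affects new connections).
SET GLOBAL long_query_time = 1;

-- Find the log file to feed to pt-query-digest.
SHOW VARIABLES LIKE 'slow_query_log_file';
```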
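A rough sketch of the lookup-table idea, with entirely hypothetical table and column names (abc_10_values, value_id, txt, abc_10_id). With columns of at most char(10), as in the question, this is unlikely to pay off; it is shown only to make the suggestion concrete.

```sql
-- Hypothetical lookup table holding each distinct text value once.
CREATE TABLE abc_10_values (
  value_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  txt      CHAR(10) NOT NULL,
  UNIQUE KEY uq_txt (txt)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;

-- Assumes tbl stores an indexed integer column abc_10_id instead of the text.
SELECT t.id
FROM tbl AS t
JOIN abc_10_values AS v ON v.value_id = t.abc_10_id
WHERE v.txt = 'xxxxxxxxxx'
LIMIT 10;
```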

Answer 3

If you have no cross-over, i.e. the values in each column are unique to that column (for example, 'zz' is ONLY ever found in abc_2, never in abc_3 etc.), then I'd suggest that a full-text index in MySQL could work well. You are only doing single-word lookups, so it will be really fast, even with the new full-text support in InnoDB. (The inverted index behind a full-text index works really well for single-word lookups.)

If you do need to limit results to specific columns, then maybe an external search engine would work better. Sphinx, as you suggest, would work, but most search engines are going to be OK with such simple requirements. A nice feature of an external index is that you can set it up without changing your database table, so you could build a Sphinx index on the table without touching the table itself. (In Sphinx you would NOT need the columns as attributes; leave them as fields, and you can still do full-text queries. That uses less memory, and indexing will be much faster.) Sphinx would run such queries pretty much consistently in under 1 ms, regardless of index size, even on very modest servers.
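A sketch of the full-text variant, assuming MySQL 5.6 or later (InnoDB full-text support), a table named tbl, and a made-up index name ft_abc. One caveat to verify before relying on it: InnoDB ignores tokens shorter than innodb_ft_min_token_size, which defaults to 3, so values such as 'a' or 'zz' would not be found unless that setting is lowered and the index rebuilt.

```sql
-- One full-text index across all abc_* columns (hypothetical name ft_abc).
ALTER TABLE tbl
  ADD FULLTEXT INDEX ft_abc
    (abc_1, abc_2, abc_3, abc_4, abc_5, abc_6, abc_7, abc_8, abc_9, abc_10);

-- The MATCH column list must be the same as the index's column list.
SELECT id
FROM tbl
WHERE MATCH (abc_1, abc_2, abc_3, abc_4, abc_5, abc_6, abc_7, abc_8, abc_9, abc_10)
      AGAINST ('xxx' IN BOOLEAN MODE)
LIMIT 10;
```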
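For illustration, a lookup against Sphinx could be issued over its SQL-like interface (SphinxQL). This assumes a Sphinx index, here given the hypothetical name idx_tbl, has already been built from tbl with the abc_* columns as full-text fields, and that the client is connected to searchd's SphinxQL listener rather than to MySQL.

```sql
-- SphinxQL, not MySQL: restrict the match to the abc_2 field.
SELECT id FROM idx_tbl WHERE MATCH('@abc_2 zz') LIMIT 10;
```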
