
Why doesn't MySQL use my partitions as indices?

I created a table partitioned on a numeric ID:

CREATE TABLE mytable (
...
`id` int(11) DEFAULT NULL
...
) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY HASH (`id`) PARTITIONS 100

I have no primary key, but a number of indices. My table currently holds no rows where id is less than 0 or greater than 30, although I expect that range to grow. Most of my queries filter on id first to reduce the search space.

I figured that a query such as select distinct(id) from mytable would then only need to return the number of partitions that have data in them. I was surprised that an explain on it shows a full scan of the data instead:

explain partitions select distinct(id) from mytable;

|  1 | SIMPLE      | mytable | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p30,p31,p32,p33,p34,p35,p36,p37,p38,p39,p40,p41,p42,p43,p44,p45,p46,p47,p48,p49,p50,p51,p52,p53,p54,p55,p56,p57,p58,p59,p60,p61,p62,p63,p64,p65,p66,p67,p68,p69,p70,p71,p72,p73,p74,p75,p76,p77,p78,p79,p80,p81,p82,p83,p84,p85,p86,p87,p88,p89,p90,p91,p92,p93,p94,p95,p96,p97,p98,p99 | ALL  | NULL          | NULL | NULL    | NULL | 24667132 | Using temporary |

explain select distinct(id) from mytable;
+----+-------------+----------------------+------+---------------+------+---------+------+----------+-----------------+
| id | select_type | table                | type | possible_keys | key  | key_len | ref  | rows     | Extra           |
+----+-------------+----------------------+------+---------------+------+---------+------+----------+-----------------+
|  1 | SIMPLE      | mytable              | ALL  | NULL          | NULL | NULL    | NULL | 24667132 | Using temporary |
+----+-------------+----------------------+------+---------------+------+---------+------+----------+-----------------+

I then read this Stack Overflow answer, which explained how MySQL's partitioning hash() function works.

My question is: how can I get MySQL to map each id in the table to its own partition, so that selects on the id narrow the search to a single partition (and a select distinct() only has to count the partitions rather than scan them)?

I'm using Server version: 5.5.35-0ubuntu0.12.04.2 (Ubuntu).

First off, you're conflating two different things. One is the fact that a SELECT WHERE id = ? should only search one partition. You mentioned this but didn't say whether it currently works (given your table definition, I don't see why it shouldn't).
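One way to check whether that pruning already happens is to run EXPLAIN PARTITIONS on a lookup with a concrete id. This is only a sketch: the value 7 is an arbitrary example, and EXPLAIN PARTITIONS is the spelling used by MySQL 5.5, the version mentioned in the question.

```sql
-- Hypothetical lookup; 7 is an arbitrary example value for id.
EXPLAIN PARTITIONS
SELECT * FROM mytable WHERE id = 7;
-- With pruning, the partitions column should name a single partition
-- (p7 here, since MOD(7, 100) = 7) instead of the full list p0..p99.
```

If the partitions column still lists every partition for an equality lookup like this, then pruning is not being applied and that would be a separate problem from the DISTINCT one.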

The second thing, having a SELECT distinct(id) touch only the partitioning information, is very different from this. If I understand you correctly, you're assuming that one partition holds only one value of id. That is not how HASH partitioning works, though. It works much like a traditional hash table, mapping a large key space onto a small one, in your case 100 partitions. So each partition can hold many different ids. Since MySQL does not keep track of which of the possible ids are actually present in a partition, all it can do is scan each partition, do the DISTINCT, and give back the result.

That said, it could do the DISTINCT operation on the individual partitions instead of the whole table, and it could do this in parallel. However, the explain output seems to imply that it will create one big temporary table to do the DISTINCT, likely because this optimization hasn't been implemented yet.
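To make the key-space mapping concrete: for PARTITION BY HASH (`id`) PARTITIONS 100 on an integer column, the partition number is the value modulo the partition count. The statement below is a small illustration with made-up id values (5, 105 and 205 are examples, not data from the question) and it does not touch mytable at all.

```sql
-- Example ids only; each row shows which partition number the id hashes to
-- under PARTITION BY HASH(id) PARTITIONS 100, i.e. MOD(id, 100).
SELECT 5 AS id, MOD(5, 100) AS partition_no
UNION ALL
SELECT 105, MOD(105, 100)
UNION ALL
SELECT 205, MOD(205, 100);
-- All three rows report partition_no = 5, so they would share partition p5.
```

With ids currently between 0 and 30, each id happens to sit alone in its partition, but nothing in the table definition guarantees that, which is why the server cannot answer the DISTINCT from partition metadata.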
