
Performance of indexing a table with just one column in MySQL

Is there any benefit to creating an index on a temporary table that contains just the primary key column from a materialized query?

I want to delete some data from a particular table, as well as from related tables that reference it through foreign keys. To improve performance, I'm materializing the initial select into a temp table and then joining against it for the subsequent deletes.

The temp table contains only one column: the primary key from the subquery. Is there any performance benefit to creating an index on the temp table's id column? In my testing I saw an improvement of about 2% (more than offset by the overhead of creating the index), but perhaps the dataset available to me for testing was not large enough.

CREATE TEMPORARY TABLE ids AS (SELECT id FROM tableA WHERE xxx);
DELETE tableB FROM tableB INNER JOIN ids ON tableB.a_id = ids.id;
DELETE tableC FROM tableC INNER JOIN ids ON tableC.a_id = ids.id;
...
DELETE tableA FROM tableA INNER JOIN ids ON tableA.id = ids.id;

Since all rows from the ids temporary table will be used to delete rows in tableB (a_id is indexed), is there any performance benefit to creating a primary key / index on the ids temporary table? Is there a better, completely different way to approach this?

It entirely depends on what type of queries you run. If you only ever run queries that need to read, or return, the entire table or a significant subset of it, then adding an index will only decrease write performance (which it always does). If you will often execute queries that can use such an index to reduce the number of disk page I/Os (because you are looking for only one row, or a very small percentage of the rows in the table), then adding an index will markedly increase the performance of those queries.

Actually, this is one case where a primary key index could be dangerous for performance.

The queries that you have essentially have two logical execution paths. One is to read the "other" table and look up values in ids. The second is to read the ids table and look up values in the "other" table. The latter execution plan is the best one, assuming that ids is much smaller than the other table.
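
To see which of the two paths MySQL actually picks, you can run EXPLAIN on a SELECT with the same join as the delete. This is only a sketch using the table names from the question; the plan you get depends on your data, statistics, and MySQL version.

```sql
-- EXPLAIN lists the tables in the order they are read.
-- If ids appears on the first row, MySQL reads ids and looks up
-- tableB rows through the index on a_id (the second path above).
-- If tableB appears first, it reads tableB and probes ids instead.
EXPLAIN
SELECT tableB.*
FROM tableB
INNER JOIN ids ON tableB.a_id = ids.id;
```

Running it once with and once without an index on ids should show whether the index changes the join order.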

The problem with the primary key index is that it might confuse the optimizer by making the first option seem reasonable. If you trust the optimizer, then having the index is no problem. But it does raise the possibility of confusion.
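
If you would rather not rely on the optimizer, one option (my suggestion, not something from the question) is to pin the join order with STRAIGHT_JOIN, which tells MySQL to read the left table before the right one. A minimal sketch, assuming the multi-table DELETE accepts the same join syntax as SELECT on your version; check the result with EXPLAIN before using it.

```sql
-- Force MySQL to read ids first and look up matching tableB rows
-- through the existing index on tableB.a_id.
DELETE tableB
FROM ids
STRAIGHT_JOIN tableB ON tableB.a_id = ids.id;
```

This only helps when ids really is the smaller side; if it is not, forcing the order can make things slower.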

Now, to confuse matters further, there are cases where having the index would be very beneficial. This occurs when the ids table is large relative to the other tables, and those tables are themselves quite big. In this case, you want to do the deletes in "primary key" order for the "other" table. So reading that table in order and looking up each id makes sense. This would only be the case when most pages have at least two records on them that are to be deleted.
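
If you do fall into that case, here is a sketch of two ways to give the temp table a primary key, using the same names as the question (xxx is the asker's placeholder for the WHERE condition, left as-is). Both should end up with the same index; I have not measured which is cheaper, so treat that as something to test on your own data.

```sql
-- Use one of the two options, not both.

-- Option 1: declare the key while the table is created and filled.
CREATE TEMPORARY TABLE ids (PRIMARY KEY (id))
AS (SELECT id FROM tableA WHERE xxx);

-- Option 2: create the table as in the question, then add the key.
CREATE TEMPORARY TABLE ids AS (SELECT id FROM tableA WHERE xxx);
ALTER TABLE ids ADD PRIMARY KEY (id);
```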


 