Should I create 2 indexes for the same column to speed up a join?

I am new to database indexes, and I've just read about what an index is, the differences between clustered and non-clustered indexes, and what a composite index is.

So for an inner join query like this:

SELECT table1.columnA
FROM table1
INNER JOIN table2
ON table1.columnA = table2.columnA;

In order to speed up the join, should I create 2 indexes, one on table1.columnA and the other on table2.columnA, or just 1 index on either table1 or table2?

Is one good enough? I don't get it. For example, if I select some data from table2 first and then join on columnA based on that result, I am looping through the rows from table2 one by one. An index on table2.columnA is totally useless here, because I don't need to find anything in table2 at that point. So I need an index on table1.columnA.

And vice versa: I need an index on table2.columnA if I select some results from table1 first and then want to join on columnA.

Well, I don't know what "select xxxx first, then join based on..." looks like in reality, but that scenario just came to mind. It would be much appreciated if someone could also give a simple example.

One index is sufficient, but the question is which one?

It depends on how the MySQL optimizer decides to order the tables in the join.

For an inner join, the results are the same for table1 INNER JOIN table2 versus table2 INNER JOIN table1, so the optimizer may choose to change the order. It is not constrained to join the tables in the order you specified them in your query.

The difference from an indexing perspective is whether it will first loop over rows of table1 and do lookups to find matching rows in table2, or vice versa: loop over rows of table2 and do lookups to find matching rows in table1.

MySQL does joins as "nested loops". It's as if you had written code in your favorite language like this:

foreach row in table1 {
  look up rows in table2 matching table1.column_name
}

This lookup will make use of the index in table2. An index in table1 is not relevant to this example, since your query is scanning every row of table1 anyway.
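The pseudocode above can be sketched in Python. This is only an illustration under stated assumptions: a plain dict stands in for the index on table2.columnA (a real MySQL index is a B-tree), and the sample rows and the helper names build_index and nested_loop_join are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample rows; each row is a dict with a "columnA" key.
table1 = [{"columnA": 1}, {"columnA": 2}, {"columnA": 3}]
table2 = [{"columnA": 2}, {"columnA": 3}, {"columnA": 3}, {"columnA": 4}]


def build_index(rows, column):
    """Map each column value to the rows holding it (a stand-in for an index)."""
    index = defaultdict(list)
    for row in rows:
        index[row[column]].append(row)
    return index


def nested_loop_join(outer_rows, inner_index, column):
    """Scan every outer row and use the index to find matching inner rows."""
    result = []
    for outer in outer_rows:  # full scan of the outer table
        # Index lookup: no scan of the inner table is needed.
        for _match in inner_index.get(outer[column], []):
            result.append(outer[column])
    return result


table2_index = build_index(table2, "columnA")
joined = nested_loop_join(table1, table2_index, "columnA")
print(joined)  # [2, 3, 3]
```

Only the inner table's index is ever consulted. Swapping the roles (scan table2, look up in an index built on table1) returns the same values, which is why the optimizer is free to pick either order.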

How can you tell which table order is used? You can use EXPLAIN. It will show you a row for each table reference in the query, and it will present them in the join order.

Keep in mind that the presence of an index in either table may influence the optimizer's choice of how to order the tables. It will try to pick the table order that results in the least expensive query.

So maybe it doesn't matter which table you add the index to: whichever one you put the index on will likely become the second table in the join order, because doing the lookup that way is more efficient. Use EXPLAIN to find out.
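To try this without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. Note the assumptions: SQLite is only a stand-in, its EXPLAIN QUERY PLAN output is formatted differently from MySQL's EXPLAIN, and its optimizer may make different choices. The sample data and the index name idx_table2_columnA are hypothetical. In MySQL you would instead prefix the query with EXPLAIN and read the rows top to bottom.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (columnA INTEGER);
    CREATE TABLE table2 (columnA INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (3), (3), (4);
    -- Hypothetical index name; only table2 gets an index.
    CREATE INDEX idx_table2_columnA ON table2 (columnA);
""")

query = """
    SELECT table1.columnA
    FROM table1
    INNER JOIN table2 ON table1.columnA = table2.columnA
"""

# The last field of each plan row describes one step of the join.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan:
    print(step)

rows = sorted(r[0] for r in conn.execute(query))
print(rows)  # [2, 3, 3]
```

Moving the CREATE INDEX statement to table1 and rerunning shows whether the planner changes which table it scans and which one it searches. The query result stays the same either way.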

90% of the time in a properly designed relational database, one of the two columns you join on is a primary key, and so it should already have a clustered index (in MySQL's InnoDB engine, the primary key is the clustered index).

So as long as you're in that case, you don't need to do anything at all. The only reason to add additional non-clustered indexes is if you're also filtering the join further with a WHERE clause at the end of your statement. In that case, make sure the join columns and the filtered columns are together in one suitable index (i.e. in the correct column order, etc.).

