
Clustered and nonclustered index large data insert

Original post: 2013-06-08 20:57:13. Tags: sql, sql-server, indexing, sql-server-2012

I had a clustered index on 5 keys (columns) and a nonclustered index on 2 columns. Because I'm inserting 2-3 million rows in one run, I changed the 2-column nonclustered index to clustered and the 5-column clustered index to a 5-column nonclustered index. My questions:

When making an index clustered (basically dropping the index and recreating it as clustered), I don't need the include (any columns), right, since it is clustered?
Is it generally correct to make the index with fewer columns the clustered one and to change the wide clustered index to nonclustered? In other words, should a clustered index be simple and small?
Are there any performance issues if I switch these two indexes?
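
For illustration, the change described above might look roughly like the following T-SQL. This is only a sketch: the table name `dbo.BigTable`, the column names `K1` to `K5`, and all index names are hypothetical, since the original post does not show the schema.

```sql
-- Hypothetical names throughout; the real schema is not shown in the post.
-- Assumed starting point:
--   CX_BigTable_5col : clustered index on (K1, K2, K3, K4, K5)
--   IX_BigTable_2col : nonclustered index on (K1, K2), possibly with INCLUDE columns

-- Drop the nonclustered index, then the wide clustered index.
-- (If an index backs a PRIMARY KEY or UNIQUE constraint, DROP INDEX is not
--  allowed; ALTER TABLE ... DROP CONSTRAINT would be needed instead.)
DROP INDEX IX_BigTable_2col ON dbo.BigTable;
DROP INDEX CX_BigTable_5col ON dbo.BigTable;

-- Recreate the two-column index as the clustered index.
-- A clustered index takes no INCLUDE clause: its leaf level is the full row.
CREATE CLUSTERED INDEX CX_BigTable_2col
    ON dbo.BigTable (K1, K2);

-- Recreate the five-column index as nonclustered.
CREATE NONCLUSTERED INDEX IX_BigTable_5col
    ON dbo.BigTable (K1, K2, K3, K4, K5);
```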

1 Answer

Unless it's a link table, you normally have the clustered index on 1 column. A general recommendation is to choose the smallest possible data type for the clustered index column (one that still fits your requirements, of course). Having many columns not only increases size (each non-clustered index stores the clustered index key!), but also greatly increases the chance of external fragmentation and can degrade the performance even of inserts. Thus, my answers to your questions:

That's correct: the clustered index is the table itself, so there is no need to include any columns.
Yes, absolutely.
I'm not sure whether you are asking about the performance of the switch itself or about the performance impact of having a smaller (fewer-column) clustered index, so I'll try to answer both.
- The switch itself. When you drop a clustered index, I believe the data itself does not have to move (I don't think the engine actually shuffles blocks and extents to make a heap). The IAM definitely has to be changed, and any non-clustered indexes still on the table have to be rebuilt because their row locator changes from the clustered key to a row ID, so this takes time. Changing a non-clustered index to clustered involves much more activity: in addition to moving the data into clustered key order, SQL Server has to update all non-clustered indexes.
- Further impact (quite a large topic, so this is a very short answer): a smaller clustered index key means less space is needed to store all the other indexes, which in turn means faster access to data and less resource consumption by the engine.
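
One way to check the size and fragmentation effects on a real table is to compare page counts before and after the change. The query below is a minimal sketch using the `sys.dm_db_index_physical_stats` function and the `sys.indexes` catalog view; the table name `dbo.BigTable` is a placeholder, not something from the original post.

```sql
-- dbo.BigTable is a hypothetical table name; substitute your own.
-- Run in the database that owns the table, before and after the index change.
SELECT  i.name                          AS index_name,   -- NULL for a heap
        i.type_desc                     AS index_type,   -- CLUSTERED / NONCLUSTERED / HEAP
        ps.alloc_unit_type_desc,
        ps.page_count,
        ps.page_count * 8 / 1024.0      AS size_mb,      -- pages are 8 KB each
        ps.avg_fragmentation_in_percent
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.BigTable'), NULL, NULL, 'LIMITED') AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
ORDER BY ps.page_count DESC;
```

With a narrower clustered key, the `page_count` of each non-clustered index would be expected to drop, since every non-clustered index row carries the clustered key.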

Update: I realized (thanks to Aaron Bertrand for pointing this out) that I made a rather vague statement about non-clustered indexes including the clustered index. To be absolutely correct, each non-clustered index row contains a row locator that points to the data row. When the table is clustered, the row locator is the clustered index key. More info regarding clustered indexes: [1], [2].
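
A small demo can make the row locator visible. The sketch below uses a hypothetical table `dbo.LocatorDemo` (not from the original post): because the clustered key `Id` is stored in every row of the non-clustered index, a query that needs only `Id` and `Name` can be answered from that index alone. The plan the optimizer actually picks depends on data volume and statistics, so it is worth checking the execution plan rather than assuming.

```sql
-- Hypothetical demo table; all names are made up for illustration.
CREATE TABLE dbo.LocatorDemo
(
    Id      int          NOT NULL,
    Name    varchar(50)  NOT NULL,
    Payload varchar(200) NULL
);

CREATE UNIQUE CLUSTERED INDEX CX_LocatorDemo
    ON dbo.LocatorDemo (Id);

-- Only Name is listed, but each index row also carries Id as its row locator.
CREATE NONCLUSTERED INDEX IX_LocatorDemo_Name
    ON dbo.LocatorDemo (Name);

-- Can be satisfied by IX_LocatorDemo_Name alone: Id comes from the row locator.
SELECT Id, Name
FROM   dbo.LocatorDemo
WHERE  Name = 'abc';

-- Payload is stored only in the clustered index, so this needs a key lookup
-- through the row locator (or a scan of the clustered index).
SELECT Id, Name, Payload
FROM   dbo.LocatorDemo
WHERE  Name = 'abc';
```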