
Clustered Index

Which type of index (clustered or nonclustered) should be used for INSERT/UPDATE/DELETE statements in SQL Server? I know an index creates additional overhead, but does a clustered index perform better than a nonclustered index? Also, which index should be used for SELECT statements in SQL Server?

Not 100% sure what you're expecting to hear - you can only ever have a single clustered index on a table, and by default, every table (with very few edge-case exceptions) should have one. All indices typically help your SELECTs the most, and some tend to hurt the INSERTs, DELETEs and possibly UPDATEs a bit (or a lot, if chosen poorly).

A good clustered index makes a table faster, for every operation. YES! It does. See Kim Tripp's excellent "The Clustered Index Debate Continues" for background info. She also mentions her main criteria for a clustered index:

  • narrow
  • static (never changes)
  • unique
  • if at all possible: ever increasing

INT IDENTITY fulfills this perfectly - GUIDs do not. See "GUIDs as Primary Key" for extensive background info.
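As a rough illustration of that contrast, here is a minimal sketch. The tables dbo.OrdersInt and dbo.OrdersGuid and all of their columns are hypothetical and do not come from the question. The second table shows one common way to keep a GUID as the primary key without clustering on it; treat it as an assumption, not as something this answer prescribes.

```sql
-- Hypothetical table: the clustering key is narrow (4 bytes), static,
-- unique and ever increasing.
CREATE TABLE dbo.OrdersInt (
    OrderID   INT IDENTITY(1,1) NOT NULL,
    OrderDate DATETIME2(0)      NOT NULL,
    CONSTRAINT PK_OrdersInt PRIMARY KEY CLUSTERED (OrderID)
);

-- Hypothetical table: the GUID remains the primary key, but it is enforced
-- by a nonclustered index; a separate INT IDENTITY column carries the
-- clustered index instead.
CREATE TABLE dbo.OrdersGuid (
    OrderGuid UNIQUEIDENTIFIER  NOT NULL
        CONSTRAINT DF_OrdersGuid_OrderGuid DEFAULT NEWID(),
    RowID     INT IDENTITY(1,1) NOT NULL,
    OrderDate DATETIME2(0)      NOT NULL,
    CONSTRAINT PK_OrdersGuid PRIMARY KEY NONCLUSTERED (OrderGuid)
);

CREATE UNIQUE CLUSTERED INDEX CIX_OrdersGuid_RowID
    ON dbo.OrdersGuid (RowID);
```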

Why narrow? Because the clustering key is added to each and every entry of each and every non-clustered index on the same table (in order to be able to actually look up the data row, if needed). You don't want to have VARCHAR(200) in your clustering key....
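To get a feel for the cost, a back-of-the-envelope calculation helps. The row count and the number of non-clustered indices below are made-up assumptions, and the figures ignore row and page overhead and assume the VARCHAR(200) values are fully populated, so this is only a rough sketch.

```sql
-- Assumed figures, purely for illustration.
DECLARE @rows       BIGINT = 1000000;  -- rows in the table
DECLARE @nc_indexes INT    = 5;        -- non-clustered indices on the table

-- Every non-clustered index entry carries a copy of the clustering key,
-- so the extra space is roughly rows * indices * key width.
SELECT
    @rows * @nc_indexes * 4   / 1048576.0 AS int_key_mb,        -- 4-byte INT key
    @rows * @nc_indexes * 200 / 1048576.0 AS varchar200_key_mb; -- 200-byte key
```

With these assumed numbers the INT key costs roughly 19 MB across all five indices, while the fully populated VARCHAR(200) key costs roughly 950 MB.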

Why unique? See above - the clustering key is the item and mechanism that SQL Server uses to uniquely find a data row. SQL Server needs it to be unique. If you pick a non-unique clustering key, SQL Server itself will add a 4-byte uniquifier to your keys. Be careful of that!
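A small sketch of the two options, using a hypothetical dbo.Events table (the table, column and index names are assumptions made for illustration):

```sql
-- Hypothetical table with many rows per EventDate.
CREATE TABLE dbo.Events (
    EventID   INT IDENTITY(1,1) NOT NULL,
    EventDate DATE              NOT NULL,
    Payload   NVARCHAR(100)     NULL
);

-- Option A: non-unique clustering key. SQL Server accepts this, but has to
-- add its own hidden uniquifier to tell rows with the same EventDate apart.
CREATE CLUSTERED INDEX CIX_Events_EventDate
    ON dbo.Events (EventDate);

-- Option B: drop option A and declare the key unique yourself, here by
-- appending the identity column, so no uniquifier is needed.
DROP INDEX CIX_Events_EventDate ON dbo.Events;

CREATE UNIQUE CLUSTERED INDEX CIX_Events_EventDate_EventID
    ON dbo.Events (EventDate, EventID);
```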

Next: non-clustered indices. Basically there's one rule: any foreign key in a child table referencing another table should be indexed; it'll speed up JOINs and other operations.
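A minimal sketch of that rule, with hypothetical dbo.Customer and dbo.CustomerOrder tables (all names are assumptions). Note that SQL Server does not create an index on a foreign key column automatically, so it has to be added explicitly.

```sql
-- Hypothetical parent table.
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED,
    Name       NVARCHAR(100)     NOT NULL
);

-- Hypothetical child table referencing the parent.
CREATE TABLE dbo.CustomerOrder (
    CustomerOrderID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_CustomerOrder PRIMARY KEY CLUSTERED,
    CustomerID      INT               NOT NULL
        CONSTRAINT FK_CustomerOrder_Customer
        REFERENCES dbo.Customer (CustomerID),
    OrderDate       DATETIME2(0)      NOT NULL
);

-- Index the foreign key column so that joins back to dbo.Customer (and
-- deletes on dbo.Customer that must check for child rows) can seek.
CREATE NONCLUSTERED INDEX IX_CustomerOrder_CustomerID
    ON dbo.CustomerOrder (CustomerID);
```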

Furthermore, any queries that have WHERE clauses are good candidates - pick first those that are executed a lot. Put indices on columns that show up in WHERE clauses and in ORDER BY clauses.
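For example, assuming a hypothetical dbo.Ticket table and a frequently run query that filters on Status and sorts by CreatedAt (none of these names come from the question), an index could be sketched like this:

```sql
-- Hypothetical table.
CREATE TABLE dbo.Ticket (
    TicketID  INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Ticket PRIMARY KEY CLUSTERED,
    Status    TINYINT           NOT NULL,
    CreatedAt DATETIME2(0)      NOT NULL,
    Title     NVARCHAR(200)     NOT NULL
);

-- Column from the WHERE clause first, then the ORDER BY column, so the
-- matching rows can be read from the index already in sorted order.
CREATE NONCLUSTERED INDEX IX_Ticket_Status_CreatedAt
    ON dbo.Ticket (Status, CreatedAt);

-- The frequently executed query this index is meant to support.
-- TicketID is the clustering key, so it is available in the index too.
SELECT TicketID, CreatedAt
FROM dbo.Ticket
WHERE Status = 1
ORDER BY CreatedAt;
```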

Next: measure your system, check the DMVs (dynamic management views) for hints about unused or missing indices, and tweak your system over and over again. It's an ongoing process; you'll never be done!
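As a starting point, the two queries below sketch how those DMVs might be consulted for the current database. The filters and sort orders are assumptions for illustration; the usage counters are reset when the instance restarts, so treat the results as hints rather than verdicts.

```sql
-- Non-clustered indices that have been maintained (user_updates) but not
-- read (no seeks, scans or lookups) since the counters were last reset.
SELECT
    OBJECT_NAME(i.object_id) AS table_name,
    i.name                   AS index_name,
    s.user_seeks,
    s.user_scans,
    s.user_lookups,
    s.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type_desc = 'NONCLUSTERED'
  AND ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0)
      + ISNULL(s.user_lookups, 0) = 0
ORDER BY ISNULL(s.user_updates, 0) DESC;

-- Indices the optimizer would have liked to have, most promising first.
SELECT
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    gs.user_seeks,
    gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
  ON gs.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY gs.user_seeks * gs.avg_user_impact DESC;
```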

Another word of warning: with a truckload of indices, you can make any SELECT query go really, really fast. But at the same time, INSERTs, UPDATEs and DELETEs, which have to update all the indices involved, might suffer. If you only ever SELECT - go nuts! Otherwise, it's a fine and delicate balancing act. You can always tweak a single query beyond belief - but the rest of your system might suffer in doing so. Don't over-index your database! Put a few good indices in place, check and observe how the system behaves, and then maybe add another one or two, and again: observe how the total system performance is affected by that.

I am not quite sure what you mean by "should be used for Insert/Update/Delete statement", but in my opinion every table should have a clustered index. The clustered index specifies the order in which the data is actually stored. If a clustered index is not defined, the data will simply be stored in a heap. If you don't have a natural column to serve as your clustered index, you could always just create an identity column as an int or bigint, like this:

-- The identity column serves as the clustered primary key.
CREATE TABLE [dbo].[demo] (
    [ID]        [int] IDENTITY(1,1) NOT NULL,
    [FirstName] [nchar](10) NULL,
    [LastName]  [nchar](10) NULL,
    [Job]       [nchar](10) NULL,
    CONSTRAINT [PK_demo] PRIMARY KEY CLUSTERED ([ID] ASC)
);
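To see which user tables in the current database are still stored as heaps, a query along these lines can help. It is a minimal sketch against the sys.indexes catalog view; the column aliases are arbitrary.

```sql
-- A table without a clustered index has a row of type HEAP in sys.indexes.
SELECT
    OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
    OBJECT_NAME(i.object_id)        AS table_name
FROM sys.indexes AS i
WHERE i.type_desc = 'HEAP'
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY schema_name, table_name;
```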
