
Non-clustered index use in SQL Server

Can anyone tell me what the use of non-clustered indexes in SQL Server is? As far as I know, both clustered and non-clustered indexes make searching easier.

One use is that you can only have one clustered index on a table. If you want more than one index, the rest have to be non-clustered.

The clustered index determines how the data for each row of the table is physically stored on disk (you can only have one of these per table), so the performance of all write operations depends on this index. If you have to rebuild this index or move data around within it, that can be very expensive.

Nonclustered indexes are just a listing of specific parts of the rows, in a different order from how the rows are physically stored (you can have several of these per table), plus a pointer to where each row is actually stored. Nonclustered indexes make it easy to find a specific row when you only know certain information about that row.
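To make the "listing plus pointer" idea concrete, here is a toy model in Python. It is not SQL Server code: the sample rows and the `find_by_last_name` helper are made up for illustration, and plain sorted lists stand in for the B-trees a real engine uses.

```python
from bisect import bisect_left

# Rows kept in clustered-key order (person_id); this list plays the role of the table.
rows = [
    (1, "Smith", "Ann"),
    (2, "Jones", "Bob"),
    (3, "Adams", "Cy"),
    (4, "Jones", "Dee"),
]
row_ids = [pid for pid, _last, _first in rows]

# Toy nonclustered index: (last_name, person_id) pairs sorted by last name.
# The person_id is the "pointer" back to the full row.
by_last_name = sorted((last, pid) for pid, last, _first in rows)


def find_by_last_name(last_name):
    """Seek in the toy index, then follow each pointer to the full row."""
    matches = []
    i = bisect_left(by_last_name, (last_name,))
    while i < len(by_last_name) and by_last_name[i][0] == last_name:
        pid = by_last_name[i][1]
        matches.append(rows[bisect_left(row_ids, pid)])
        i += 1
    return matches
```

Calling `find_by_last_name("Jones")` returns both Jones rows without scanning the whole table, which is the point of keeping a second, differently ordered listing.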

If you think of a typical textbook as a database table, the clustered index is the set of actual pages of content for that book, because logically it makes sense to write those pages in that order. A nonclustered index is the index in the back of the book that lists the important terms in alphabetical order. It just lists the word you are looking for and the page number where you can find it. This makes it extremely easy to find what you need to read when you are looking for a specific term.

Typically it is a good idea to make your clustered index an id that follows the NUSE principle (Narrow, Unique, Static, Ever increasing). You would usually accomplish this with a SMALLINT, INT, or BIGINT, depending on the amount of data you want to store in the table. This gives you a narrow key, because these types are only 2, 4, or 8 bytes wide (respectively). You would also probably want to set the IDENTITY property on that column so that it auto-increments. If you never change this value for a row (which makes it static, and there is usually no reason to change it), then it will be unique and ever increasing. This way, when you insert a new row, it simply goes at the next available spot on disk, which can help with write speeds.

Nonclustered indexes are usually used when you search for data by certain columns. So if you have a table full of people, and you commonly look for people by last name, you would probably want a nonclustered index on the people table over the last name column, or you could have one over last name, first name. If you also commonly search for people based on their age, then you may want another nonclustered index over the birthdate column. That way you can easily search for people born before or after a certain date.
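Here is a small runnable sketch of that people table. It uses Python's built-in `sqlite3` module purely because it runs anywhere; SQLite is standing in for SQL Server, and the table, column, and index names are assumptions. In SQL Server the index statements would be written as `CREATE NONCLUSTERED INDEX ...` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (
        person_id  INTEGER PRIMARY KEY,  -- auto-assigned, ever-increasing id
        last_name  TEXT NOT NULL,
        first_name TEXT NOT NULL,
        birthdate  TEXT NOT NULL         -- ISO dates sort correctly as text
    );
    -- Secondary index for searches by name.
    CREATE INDEX ix_people_name ON people (last_name, first_name);
    -- Secondary index for searches by date of birth.
    CREATE INDEX ix_people_birthdate ON people (birthdate);
""")
conn.executemany(
    "INSERT INTO people (last_name, first_name, birthdate) VALUES (?, ?, ?)",
    [
        ("Smith", "Ann", "1980-05-01"),
        ("Jones", "Bob", "1992-11-23"),
        ("Adams", "Cy", "1975-02-14"),
    ],
)

# Range search on birthdate: everyone born before 1990, oldest first.
born_before_1990 = conn.execute(
    "SELECT first_name, last_name FROM people "
    "WHERE birthdate < ? ORDER BY birthdate",
    ("1990-01-01",),
).fetchall()
```

The query would return the same rows without the indexes; on a large table they only change how much data the engine has to read to find them.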

The classic example explaining the difference is a phone book. The way the phone book is physically ordered from start to finish by last name (I think; it's been a while since I looked at a physical phone book) is analogous to the clustered index on a table. You can only have one clustered index on a table. In fact, the clustered index IS the table; it is how the table is physically stored on disk. The structure of the clustered index contains the keys you define, plus ALL the data as well. Side note: in SQL Server you don't HAVE to have a clustered index at all; such a table is called a "heap", but that's rarely a good idea.

An example of a nonclustered index would be if, say, you wanted to look up someone's entry in the phone book by address. You'd have an index at the back of the book with addresses sorted alphabetically, each one telling you where in the phone book you can find the phone number. Going back to the main listing this way is called a "lookup". So a nonclustered index has:

  • The keys you want to index (e.g. Address)
  • A pointer back to the row in the clustered index (the last name of the person at that address)
  • Optionally, a list of included columns that you might frequently need but don't want to go back to the clustered index to look up.

Whereas a clustered index contains ALL the data for each row, a nonclustered index is generally smaller because it only holds your keys, your pointer, and any included columns. You can also have many of them on the same table.

As far as how they return data, they're pretty similar, especially if you never have to do a lookup to the clustered index. A query that can get everything it needs from a nonclustered index is said to be "covered" (in that all the columns it needs are covered by the nonclustered index). Also, because a clustered index keeps the rows themselves in key order, it makes range-based queries faster: the engine can seek to the start of the range and read the neighbouring rows in order until it reaches the end.
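The difference between a covered query and one that needs a lookup can be seen in a query plan. The sketch below again uses Python's `sqlite3` as a stand-in for SQL Server, so treat it as an analogy: the `phone_book` table and the `plan` helper are hypothetical, and because SQLite has no `INCLUDE` clause, the extra column is added as a trailing key column rather than as an included column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phone_book (
        entry_id  INTEGER PRIMARY KEY,
        last_name TEXT,
        address   TEXT,
        phone     TEXT
    );
    -- Index keyed on address that also carries the phone number.
    CREATE INDEX ix_address_phone ON phone_book (address, phone);
""")


def plan(sql):
    """Return the plan detail strings SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Needs only address and phone, both of which live in the index.
covered = plan("SELECT phone FROM phone_book WHERE address = '1 Main St'")

# Needs last_name, which is not in the index, so the table must be visited too.
needs_lookup = plan("SELECT last_name FROM phone_book WHERE address = '1 Main St'")
```

SQLite describes the first plan as using a covering index and the second as using the same index without covering it. SQL Server shows the same distinction in its execution plans, where the uncovered query picks up an extra lookup step.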

The others seem to have touched on the same points, so I'll keep it short and provide a resource where you can get more information on this.

A clustered index is the table, and it (obviously) includes all columns. That may not always be what is needed, and it can be a hindrance when there are many rows of data in your result set. You can use a non-clustered index (effectively a copy of part of the table) to "cover" your query so that you get a quicker response time.

Please check out this free video from world-class DBA Brent Ozar: https://www.brentozar.com/training/think-like-sql-server-engine/

Good luck!


 