
What is the advantage of using an INCLUDE column with a SQL Server index?

CREATE NONCLUSTERED INDEX index_name 
    ON <object> ( column [ ASC | DESC ] [ ,...n ] ) 
    [ INCLUDE ( column_name [ ,...n ] ) ]
    [ WHERE <filter_predicate> ]

In the syntax above, non-key columns are specified in the INCLUDE clause. What is the advantage of specifying non-key columns?

Let's assume you have an employee table like so:

CREATE TABLE Employee(EmployeeID INT IDENTITY(1,1) PRIMARY KEY,
                      LastName VARCHAR(50),
                      FirstName VARCHAR(50),
                      HireDate DATETIME,
                      Salary DECIMAL)

You would have the clustered primary key on EmployeeID, and possibly a non-clustered index on (LastName, FirstName) in order to be able to find employees by name.

CREATE INDEX NameIndex ON Employee(LastName ASC, FirstName ASC)

Now if you need to find "Joe Murphy" and retrieve his hire date and salary, what happens is an index seek in your name-based non-clustered index (which is good), but then in order to fetch the hire date and salary, SQL Server needs to do a so-called bookmark lookup (shown as a Key Lookup in newer execution plans) into the actual table data to get the record for Joe Murphy. This will most likely incur one or several extra page reads, possibly as physical disk accesses (which is bad in terms of performance).

HOWEVER: if your name-based non-clustered index also specifies "INCLUDE (HireDate, Salary)":

CREATE INDEX NameIndex ON Employee(LastName ASC, FirstName ASC)
       INCLUDE (HireDate, Salary)

then SQL Server is done once it has looked up Joe Murphy in the non-clustered name index: all the fields needed to satisfy your query are in the non-clustered index, so there's no more need to do a disk-intensive bookmark lookup, and your queries will potentially be much faster.
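The lookup described here is not written out in the answer, so the following is only a sketch of what it might look like. The literal values are illustrative, and it assumes the Employee table and NameIndex already exist. Running it once with the plain index and once with the INCLUDE version, with STATISTICS IO enabled, is one way to compare the reads each variant needs.

```sql
-- Sketch only: assumes the Employee table and NameIndex defined above.
SET STATISTICS IO ON;   -- report logical and physical reads per statement

-- Hypothetical lookup for "Joe Murphy"; the literals are illustrative.
SELECT HireDate, Salary
FROM Employee
WHERE LastName = 'Murphy'
  AND FirstName = 'Joe';

SET STATISTICS IO OFF;
```

With the plain index, the actual execution plan would typically show an Index Seek followed by a Key Lookup; with the INCLUDE version, the seek alone can answer the query. On a very small table the optimizer may simply scan the clustered index in both cases, so the difference tends to show up only with a realistic number of rows.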

The downside of INCLUDE columns is the increased disk space needed by non-clustered indexes, since they will have the included columns in their leaf-level nodes. It's a trade-off between speed and size (as usual).
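To put a number on the size side of that trade-off, one option is to compare the pages used by each index before and after adding the included columns. This is a minimal sketch, assuming the table lives in the dbo schema and that you have permission to read the dynamic management views.

```sql
-- Sketch only: lists the space used by each index on the Employee table.
-- Assumes the table is dbo.Employee; pages are 8 KB each.
SELECT i.name                        AS index_name,
       SUM(ps.used_page_count) * 8   AS used_kb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID('dbo.Employee')
GROUP BY i.name;
```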

Marc

When an index is used, it can either lead to the location of the complete record, or all the fields you need can be stored in the index itself (so you don't need to go to the complete record).

In the first case, you have two reads in the best case: one to read the index, and one to read the complete record.

In the second case, you get everything you need from reading the index.

You can INCLUDE extra columns to be stored with the index if, together with the field(s) comprising the index key, they let your queries be satisfied completely without the extra read for the row.

This is not an issue for the CLUSTERED index because reading the index is the same as reading the entire row.

There is an especially big advantage if you are reading several index keys in sequence (e.g. SELECT ... FROM ... WHERE keys BETWEEN n1 AND n2), because the index entries will likely be stored adjacent to each other and can be read with perhaps only one or two physical reads; not needing to fetch the rest of each record provides even more leverage.
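As a concrete instance of that range pattern, here is a sketch that borrows the Employee table and the covering NameIndex from the earlier answer; the range bounds are made up for illustration.

```sql
-- Sketch only: assumes Employee and the covering NameIndex
-- (LastName, FirstName) INCLUDE (HireDate, Salary) from the earlier answer.
-- Every selected column is in the index, so the range can be served from
-- adjacent leaf pages without visiting the base rows.
SELECT LastName, FirstName, HireDate, Salary
FROM Employee
WHERE LastName BETWEEN 'Ma' AND 'Mz'   -- illustrative bounds
ORDER BY LastName, FirstName;
```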

This stores the data for the included column in the index's leaf nodes, for fast access by queries that use the index.

Reference:

INCLUDE ( column [ ,... n ] )

Specifies the non-key columns to be added to the leaf level of the nonclustered index. The nonclustered index can be unique or non-unique.

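An advantage of included columns is that they avoid the index key size limit: non-key columns do not count toward the maximum key size or the maximum number of key columns.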

SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
FROM Person.Address
WHERE PostalCode BETWEEN N'98000' and N'99999'

Although you could define all the columns as key columns, the key size would be 334 bytes. The only search criterion used here is PostalCode, so making PostalCode the key column and the rest non-key (included) columns improves performance and also keeps the key small.

CREATE INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);
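The same idea applies to columns that cannot be key columns at all. The sketch below uses a hypothetical Document table (not part of the original answer) to show a large-value column being carried as an included column, which is allowed, where using it as a key column is not.

```sql
-- Sketch only: Document is a hypothetical table for illustration.
CREATE TABLE Document (
    DocumentID INT IDENTITY(1,1) PRIMARY KEY,
    Title      NVARCHAR(200),
    Body       NVARCHAR(MAX)
);

-- Not allowed: NVARCHAR(MAX) is invalid as an index key column.
-- CREATE INDEX IX_Document_Body ON Document (Body);

-- Allowed: Title is the key; Body is stored at the leaf level only.
CREATE INDEX IX_Document_Title
ON Document (Title)
INCLUDE (Body);
```

Including a very wide column makes the index correspondingly larger, so this is worth doing only when queries genuinely need that column from the index.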

It removes the need to do a key lookup if the query uses the included column. Example:

select ssn, firstname from myusers where ssn='111-11-1111'

An appropriate index would look like this:

create nonclustered index idx_user_ssn on myusers(ssn) include(firstname)
