Nonclustered indices: Unique Index including Primary Key vs Non-Unique Index without Primary Key

The API I'm using to create indices on a table creates every nonclustered index as a unique index that includes the primary key.

So given the following table:

CREATE TABLE Person
(
    [PersonID] INT NOT NULL PRIMARY KEY, -- clustered key on this col.
    [First Name] VARCHAR(50),
    [Last Name] VARCHAR(50),
    [Other Columns] VARCHAR(50)
)

The SQL statement for a nonclustered index created by the API looks something like this:

CREATE UNIQUE NONCLUSTERED INDEX KeyName
ON Person ([First Name], [Last Name], [PersonID])

Including the primary key in the index guarantees that all entries are unique.

Personally I would have created the index like this:

CREATE NONCLUSTERED INDEX KeyName2
ON Person ([First Name], [Last Name])

I ran the following statement twice on a table containing about 21 million records, with only one of the two indices active at a time.

SELECT *
FROM Person
WHERE [First Name] = 'John'
AND [Last Name] = 'Doe'

The "Estimated CPU Cost", "Estimated I/O Cost", "Estimated Operator Cost" and "Estimated Subtree Cost" were exactly the same no matter what index was enabled/disabled.

My understanding of nonclustered indices is that the primary key isn't used to retrieve the data anyway; instead SQL Server uses some row number that it assigns independently. So is there any benefit whatsoever to using the UNIQUE index over the NON-UNIQUE index?

I would guess that the UNIQUE index takes up more disk space since it contains one more column. Because of that I would prefer the latter version.

Using SQL-Server 2012

If PersonID is your primary key, and therefore the clustering key as in your table, there is no storage difference between

CREATE NONCLUSTERED INDEX KeyName
ON Person ([First Name], [Last Name])

and

CREATE NONCLUSTERED INDEX KeyName2
ON Person ([First Name], [Last Name], [PersonID])

The PersonID is implicitly added to KeyName to identify the data row.
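One way to check this on your own data is to compare the page counts of the two indexes. The query below is only a sketch: it assumes the table lives in the `dbo` schema and that both `KeyName` and `KeyName2` are built at the same time (a disabled nonclustered index holds no pages, so it would report nothing useful).

```sql
-- Compare the storage used by the two nonclustered indexes.
-- Assumption: the table is dbo.Person and both indexes currently exist and are enabled.
SELECT i.name                  AS index_name,
       i.is_unique,
       ps.row_count,
       ps.used_page_count,
       ps.used_page_count * 8  AS used_kb      -- SQL Server pages are 8 KB
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
  ON ps.object_id = i.object_id
 AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID(N'dbo.Person')
  AND i.name IN (N'KeyName', N'KeyName2');
```

If the explanation above holds for your table, the two rows should report the same or nearly the same page count.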

In addition

CREATE UNIQUE NONCLUSTERED INDEX KeyName3
ON Person ([First Name], [Last Name], [PersonID])

and

CREATE NONCLUSTERED INDEX KeyName4
ON Person ([First Name], [Last Name], [PersonID])

are also functionally identical. Any time the primary key is explicitly included as a key column, the index is technically a unique index, even if it is only enforced by the primary key value. Adding the unique qualifier to KeyName3 is a waste of resources as SQL Server would still verify the uniqueness of the combination of all three columns, which is already guaranteed by the presence of the primary key.
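A small experiment can illustrate why the UNIQUE qualifier adds no protection here. This is a sketch on a throwaway temp table (`#PersonDemo` is a hypothetical name), not something to run against the real 21-million-row table:

```sql
-- Scratch copy of the table shape; #PersonDemo is a hypothetical name.
CREATE TABLE #PersonDemo
(
    [PersonID]   INT NOT NULL PRIMARY KEY,   -- clustered key, as in Person
    [First Name] VARCHAR(50),
    [Last Name]  VARCHAR(50)
);

CREATE UNIQUE NONCLUSTERED INDEX KeyName3
ON #PersonDemo ([First Name], [Last Name], [PersonID]);

-- Same names, different PersonID: accepted, because the three-column key differs.
INSERT INTO #PersonDemo ([PersonID], [First Name], [Last Name])
VALUES (1, 'John', 'Doe'),
       (2, 'John', 'Doe');

-- Reusing PersonID 2 is rejected whether or not KeyName3 is UNIQUE,
-- because the primary key alone already forbids it.
BEGIN TRY
    INSERT INTO #PersonDemo ([PersonID], [First Name], [Last Name])
    VALUES (2, 'Jane', 'Roe');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS error_number, ERROR_MESSAGE() AS error_message;
END CATCH;

DROP TABLE #PersonDemo;
```

The only rows the unique index could ever reject are rows that already share a PersonID, and those never get past the primary key.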

The clustering key is included in every nonclustered index to point to the data row in the clustered index. If the clustering key is the primary key or another unique key, the key value alone identifies the data row. If the clustering key is not unique, SQL Server adds a hidden value (the uniquifier) to rows whose key duplicates another row's so it can tell them apart; this is not a row number. That additional value is treated as coming after the explicit index keys for sorting purposes.

If there is no clustered index (a heap), an implicit unique value, the row identifier (RID), is used to point to the data row. It is built from the row's file, page, and slot, so it is effectively a physical row address rather than a key.
