SQL Server - Clustered index design for dictionary

I would like some advice on this. I have a table where I want to keep track of an object and a list of keys related to that object. Example:

OBJECTID   ITEMTYPE   ITEMKEY
--------   --------   -------
1          1          THE
1          1          BROWN
1          2          APPLE
1          3          ORANGE
2          2          WINDOW

Both OBJECTID and ITEMKEY have high selectivity (i.e. their values are very varied). I access the table in two ways:

  • By OBJECTID: Each time an object changes, its list of keys changes, so an index based on OBJECTID is needed. Changes happen frequently.

  • By ITEMKEY: This is for keyword searching and also happens frequently.

So I probably need two indexes, and have to choose one of them as the clustered index (the one that is accessed more frequently, or where I want the speed to be; for now, let's assume I will prioritize OBJECTID for the clustered index). What I am confused about is how I should design it.

My question is, which is better:

a) A clustered index on (OBJECTID, ITEMTYPE, ITEMKEY), and then a non-clustered index on (ITEMKEY). My concern is that since the clustered key is so wide (2 ints, 1 string), the second index will be big, because every index entry has to point back to the clustered key.

b) Create a new column DIRECTORYID (integer) with a running identity as the primary key and clustered index, and declare two indexes: one on (OBJECTID, ITEMTYPE, ITEMKEY) and one on just (ITEMKEY). This will minimize index space but have higher lookup costs.

c) A clustered index on (OBJECTID, ITEMTYPE, ITEMKEY), and a materialized view of (ITEMKEY, ITEMTYPE, OBJECTID) on it. My logic is that this avoids a key lookup and will still be just as big as the index with a lookup in a), at the cost of higher overhead.

d) Err... maybe there is a better way given the requirements?

Thanks in advance, Andrew

If at all possible, try to keep your clustered key as small as possible, since it is also added to every non-clustered index on your table.

Therefore, I would use an INT if at all possible, or possibly a combination of two INTs - but certainly never a VARCHAR column - especially if that column is potentially wide (> 10 chars) and is bound to change.

So of the options you present, I personally would choose b). Why?

Adding a surrogate DirectoryID will satisfy all the crucial criteria for a clustering key:

  • small
  • stable
  • unique
  • ever-increasing

and your other non-clustered indices will be minimally impacted.

See Kimberly Tripp's outstanding blog post on the main criteria for choosing a good clustering key on your SQL Server tables - very useful and enlightening!

To satisfy your query requirements, I would add two non-clustered indices, one on ObjectID (possibly including other columns frequently needed), and another on ItemKey to search by key name.
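
As a rough sketch, option b) with these two indices might look like the T-SQL below. The table name `dbo.ObjectDirectory`, the index and constraint names, the column types (the question only says "2 ints, 1 string"), and the choice of INCLUDE columns are all assumptions made for illustration; adjust them to your actual schema and queries.

```sql
-- Hypothetical table; names and column types are assumptions.
CREATE TABLE dbo.ObjectDirectory
(
    DirectoryID INT IDENTITY(1,1) NOT NULL,  -- small, stable, unique, ever-increasing
    ObjectID    INT               NOT NULL,
    ItemType    INT               NOT NULL,
    ItemKey     VARCHAR(50)       NOT NULL,  -- length is an assumption
    CONSTRAINT PK_ObjectDirectory
        PRIMARY KEY CLUSTERED (DirectoryID)
);

-- Access path 1: fetch or replace all keys of one object.
-- Including ItemType and ItemKey is one possible choice; it lets this
-- index answer the query without a lookup into the clustered index.
CREATE NONCLUSTERED INDEX IX_ObjectDirectory_ObjectID
    ON dbo.ObjectDirectory (ObjectID)
    INCLUDE (ItemType, ItemKey);

-- Access path 2: keyword search by ItemKey.
-- Each entry carries only ItemKey plus the 4-byte DirectoryID.
CREATE NONCLUSTERED INDEX IX_ObjectDirectory_ItemKey
    ON dbo.ObjectDirectory (ItemKey);

-- Example queries for the two access paths.
SELECT ItemType, ItemKey
FROM dbo.ObjectDirectory
WHERE ObjectID = 1;

SELECT ObjectID, ItemType
FROM dbo.ObjectDirectory
WHERE ItemKey = 'APPLE';
```

With the ItemKey index defined as above, the second query still needs a key lookup to fetch ObjectID and ItemType. If that lookup turns out to matter, adding those columns via INCLUDE on the ItemKey index is one way to trade some index space for fewer lookups.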
