
Guid Primary/Foreign Key dilemma SQL Server

I am faced with the dilemma of changing my primary keys from int identities to Guid. I'll put my problem straight up. It's a typical retail management app, with POS and back-office functionality. It has about 100 tables. The database synchronizes with other databases and receives/sends new data.

Most tables don't have frequent inserts, updates or select statements executing on them. However, some do have frequent inserts and selects on them, e.g. the products and orders tables.

Some tables have up to 4 foreign keys in them. If I changed my primary keys from 'int' to 'Guid', would there be a performance issue when inserting or querying data from tables that have many foreign keys? I know people have said that indexes will be fragmented and 16 bytes is an issue.

Space wouldn't be an issue in my case, and apparently index fragmentation can also be taken care of using the 'NEWSEQUENTIALID()' function. Can someone tell me, from their experience, if Guid will be problematic in tables with many foreign keys?

I'd much appreciate your thoughts on it...

GUIDs may seem to be a natural choice for your primary key - and if you really must, you could probably argue to use it for the PRIMARY KEY of the table. What I'd strongly recommend not to do is use the GUID column as the clustering key, which SQL Server does by default, unless you specifically tell it not to.

You really need to keep two issues apart:

1) the primary key is a logical construct - one of the candidate keys that uniquely and reliably identifies every row in your table. This can be anything, really - an INT, a GUID, a string - pick what makes most sense for your scenario.

2) the clustering key (the column or columns that define the "clustered index" on the table) - this is a physical storage-related thing, and here, a small, stable, ever-increasing data type is your best pick - INT or BIGINT as your default option.

By default, the primary key on a SQL Server table is also used as the clustering key - but it doesn't need to be that way! I've personally seen massive performance gains when breaking up the previous GUID-based Primary / Clustered Key into two separate keys - the primary (logical) key on the GUID, and the clustering (ordering) key on a separate INT IDENTITY(1,1) column.
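As a rough illustration of that split, the table definition could look something like the following. This is only a sketch under assumptions: the table dbo.Orders, its columns, and all constraint and index names are hypothetical and not taken from the question.

```sql
-- Hypothetical table; every name here is illustrative only.
CREATE TABLE dbo.Orders
(
    -- Narrow, stable, ever-increasing column intended as the clustering key
    OrderId    INT IDENTITY(1,1) NOT NULL,

    -- GUID used as the logical primary key (e.g. for synchronization)
    OrderGuid  UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_Orders_OrderGuid DEFAULT NEWSEQUENTIALID(),

    OrderDate  DATETIME NOT NULL,

    -- NONCLUSTERED must be stated explicitly; otherwise SQL Server
    -- would make the primary key the clustered index by default.
    CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderGuid)
);

-- Physical ordering of the table on the INT IDENTITY column
CREATE UNIQUE CLUSTERED INDEX CIX_Orders_OrderId
    ON dbo.Orders (OrderId);
```

Foreign keys in other tables can still reference OrderGuid, since it remains the primary key; only the physical ordering of the rows changes.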

As Kimberly Tripp - the Queen of Indexing - and others have stated a great many times - a GUID as the clustering key isn't optimal, since due to its randomness, it will lead to massive page and index fragmentation and to generally bad performance.

Yes, I know - there's newsequentialid() in SQL Server 2005 and up - but even that is not truly and fully sequential and thus also suffers from the same problems as the random GUID - just a bit less prominently so.
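If you want to see how much fragmentation a particular key choice actually produces, one option is to load a representative amount of data and then look at the index statistics. A minimal sketch, assuming a hypothetical table named dbo.Orders (the table name is a placeholder, not something from the question):

```sql
-- Fragmentation per index on one (hypothetical) table in the current database.
-- 'LIMITED' is the cheapest scan mode of sys.dm_db_index_physical_stats.
SELECT
    i.name                          AS index_name,
    ps.index_type_desc,
    ps.avg_fragmentation_in_percent,
    ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(),                       -- current database
         OBJECT_ID(N'dbo.Orders'),      -- placeholder table name
         NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
    ON  i.object_id = ps.object_id
    AND i.index_id  = ps.index_id
ORDER BY ps.avg_fragmentation_in_percent DESC;
```

Running this once with a NEWID() default, once with NEWSEQUENTIALID() and once with an INT IDENTITY clustering key should give a feel for the difference on your own data.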

Then there's another issue to consider: the clustering key on a table will be added to each and every entry on each and every non-clustered index on your table as well - thus you really want to make sure it's as small as possible. Typically, an INT with 2+ billion rows should be sufficient for the vast majority of tables - and compared to a GUID as the clustering key, you can save yourself hundreds of megabytes of storage on disk and in server memory.

Quick calculation - using INT vs. GUID as Primary and Clustering Key:

  • Base Table with 1'000'000 rows (3.8 MB vs. 15.26 MB)
  • 6 nonclustered indexes (22.89 MB vs. 91.55 MB)

TOTAL: 26.7 MB vs. 106.8 MB - and that's just on a single table!
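The figures above can be reproduced with a few lines of T-SQL. This sketch counts only the bytes taken by the key itself (4 bytes for INT, 16 bytes for a GUID) in the base table and in each nonclustered index, and ignores row and page overhead, which appears to be what the quick calculation does. The variable names are made up for the example.

```sql
-- Key storage only: rows * key size, once for the base table and
-- once more for every nonclustered index that carries the clustering key.
DECLARE @rows BIGINT, @nc_indexes INT;
SET @rows       = 1000000;
SET @nc_indexes = 6;

SELECT
    k.key_type,
    CAST(@rows * k.key_bytes / 1048576.0 AS DECIMAL(10,2))
        AS base_table_mb,
    CAST(@rows * k.key_bytes * @nc_indexes / 1048576.0 AS DECIMAL(10,2))
        AS nc_indexes_mb,
    CAST(@rows * k.key_bytes * (1 + @nc_indexes) / 1048576.0 AS DECIMAL(10,2))
        AS total_mb
FROM (SELECT 'INT' AS key_type, 4 AS key_bytes
      UNION ALL
      SELECT 'GUID', 16) AS k;
```

Changing @rows and @nc_indexes to match one of your larger tables gives a quick estimate of what the wider key would cost there.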

Some more food for thought - excellent stuff by Kimberly Tripp - read it, read it again, digest it! It's the SQL Server indexing gospel, really.

So if you really must change your primary keys to GUIDs - try to make sure the primary key isn't the clustering key, and you still have an INT IDENTITY field on the table that is used as the clustering key. Otherwise, your performance is likely to take a severe hit.

Disadvantages of using guid over int:

GUID values are not as efficient as integer values when used in joins, indexes and conditions. They also require more storage space than an INT (16 bytes vs. 4 bytes).

The generated GUIDs should be partially sequential for best performance (e.g., newsequentialid() on SQL Server 2005 and later) and to enable use of clustered indexes.

For more detail:

http://www.codinghorror.com/blog/2007/03/primary-keys-ids-versus-guids.html

http://blog.sqlauthority.com/2010/04/28/sql-server-guid-vs-int-your-opinion/

My take is: use an autoincrement int as the PK on the inside, and have a unique Guid column on each primary table that you use to move rows across databases.

Join on this column when you export data; do not export the int, and map the Guid back to the local int when you import data.

Especially in large volumes, ints are much smaller and faster.
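To make the export/import idea a bit more concrete, here is a rough sketch. All table and column names (dbo.Products, dbo.OrderLines, dbo.OrderLines_Import and their columns) are hypothetical, and how the import table gets filled is left open; the point is only to show int keys staying local while Guids travel between databases.

```sql
-- Hypothetical schema: int keys are used internally, Guids identify rows globally.
CREATE TABLE dbo.Products
(
    ProductId   INT IDENTITY(1,1) NOT NULL,
    ProductGuid UNIQUEIDENTIFIER  NOT NULL
        CONSTRAINT DF_Products_Guid DEFAULT NEWID(),
    Name        NVARCHAR(100)     NOT NULL,
    CONSTRAINT PK_Products PRIMARY KEY (ProductId),
    CONSTRAINT UQ_Products_Guid UNIQUE (ProductGuid)
);

CREATE TABLE dbo.OrderLines
(
    OrderLineId   INT IDENTITY(1,1) NOT NULL,
    OrderLineGuid UNIQUEIDENTIFIER  NOT NULL
        CONSTRAINT DF_OrderLines_Guid DEFAULT NEWID(),
    ProductId     INT NOT NULL,
    Quantity      INT NOT NULL,
    CONSTRAINT PK_OrderLines PRIMARY KEY (OrderLineId),
    CONSTRAINT UQ_OrderLines_Guid UNIQUE (OrderLineGuid),
    CONSTRAINT FK_OrderLines_Products
        FOREIGN KEY (ProductId) REFERENCES dbo.Products (ProductId)
);

-- Rows received from another database: Guids only, no int keys.
CREATE TABLE dbo.OrderLines_Import
(
    OrderLineGuid UNIQUEIDENTIFIER NOT NULL,
    ProductGuid   UNIQUEIDENTIFIER NOT NULL,
    Quantity      INT NOT NULL
);

-- Export: replace the local int foreign key with the referenced row's Guid.
SELECT ol.OrderLineGuid, p.ProductGuid, ol.Quantity
FROM dbo.OrderLines AS ol
JOIN dbo.Products   AS p ON p.ProductId = ol.ProductId;

-- Import: map each incoming Guid back to the local int key,
-- skipping rows that were already received earlier.
INSERT INTO dbo.OrderLines (OrderLineGuid, ProductId, Quantity)
SELECT s.OrderLineGuid, p.ProductId, s.Quantity
FROM dbo.OrderLines_Import AS s
JOIN dbo.Products          AS p ON p.ProductGuid = s.ProductGuid
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.OrderLines AS ol
                  WHERE ol.OrderLineGuid = s.OrderLineGuid);
```

With this layout, joins inside each database run on 4-byte ints, and the Guid columns are only touched during synchronization.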

Using GUID or int for the PK really depends on the scenario. There will be a performance hit when changing from INT to GUID. A GUID is 16 bytes, four times the size of a 4-byte INT. There is a good article here about the pros and cons of using GUIDs.

Why do you have to change from Integers anyway?

GUIDs do have a performance impact relative to ints, but that impact may be minimal depending on your application, so there's no way to be certain without testing. I once converted an application from ints to GUIDs with some very large tables with many foreign keys doing both very heavy modifications and queries (on the order of hundreds of thousands of records turning over daily). Things were a bit slower when run through a profiler, but there wasn't a noticeable difference from the user's perspective.

So the answer is "it depends." Like all things dealing with performance, you can't really be sure until you try it.

In my opinion, a GUID can be used in cases where we need a unique code. But its effect on performance should be taken into account. An identity column performs better when used as a PK and FK. So, depending on the situation, we can use either a GUID or a clustered key.
