
Physical Pkey in Fact table

I was in an interview. I wrote some code for them, and they were keen to know why there was no PKEY in the fact table and why it held duplicate data. In my opinion, a fact table holds foreign keys from the dimensions, so there is no need for a physical PKEY. And obviously a foreign key column will contain duplicates; that is its purpose, to show me different facts across different stages. Now, logically some composite key can act as the primary key of a fact table, but is it a good idea to declare that key physically in the database?

Summarizing my question:

1. Should a fact table physically have a primary key?
2. Can we have a physical PKEY on a set of fkey columns? (I don't think MS SQL will allow this.)
3. Should a fact table have a surrogate key just for the sake of having a pkey? Could we instead rely on ordering by another important column, such as date?

I await responses and would like to understand the different opinions on this.

I am going to assume that when the interviewer asked about a primary key for a fact table, they were asking whether it needed a surrogate primary key (i.e. a unique number, usually generated by a sequence or auto-increment).

Within the Kimball methodology, surrogate primary keys are used in dimension tables. With few exceptions, a fact table does not need a surrogate primary key. A fact table does have a primary key, but it is a composite key made up of a subset of the foreign key columns pointing back to the dimensions; together those columns form a unique identifier suitable as a primary key. This key is physical in that you define it when creating the table, and databases typically build an index for the defined primary key.
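To make the composite-key idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in database. The tables `dim_date`, `dim_product`, `dim_store` and `fact_sales`, and the helpers `build_sales_schema`, `try_insert_fact` and `demo_composite_key`, are hypothetical names chosen for illustration; they are not from the question. It assumes the grain of the fact table is one row per date, product and store.

```python
import sqlite3


def build_sales_schema(conn):
    """Create three small dimensions and a fact table keyed by its foreign keys."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
    conn.executescript("""
        CREATE TABLE dim_date    (date_key    INTEGER PRIMARY KEY, calendar_date TEXT NOT NULL);
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name  TEXT NOT NULL);
        CREATE TABLE dim_store   (store_key   INTEGER PRIMARY KEY, store_name    TEXT NOT NULL);

        CREATE TABLE fact_sales (
            date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
            product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
            store_key   INTEGER NOT NULL REFERENCES dim_store(store_key),
            quantity    INTEGER NOT NULL,
            amount      REAL    NOT NULL,
            -- Physical primary key: the combination of foreign key columns.
            PRIMARY KEY (date_key, product_key, store_key)
        );
    """)


def try_insert_fact(conn, row):
    """Insert one fact row; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


def demo_composite_key():
    conn = sqlite3.connect(":memory:")
    build_sales_schema(conn)
    conn.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01')")
    conn.execute("INSERT INTO dim_product VALUES (1, 'widget')")
    conn.executemany("INSERT INTO dim_store VALUES (?, ?)", [(1, "north"), (2, "south")])
    return [
        try_insert_fact(conn, (20240101, 1, 1, 3, 29.97)),  # first row at this grain
        try_insert_fact(conn, (20240101, 1, 2, 1, 9.99)),   # same date and product, other store
        try_insert_fact(conn, (20240101, 1, 1, 5, 49.95)),  # repeats the full key combination
    ]


if __name__ == "__main__":
    print(demo_composite_key())
```

Each foreign key column on its own repeats freely (the first two rows share a date and a product); only a repeat of the whole combination is rejected. SQL Server likewise accepts a composite `PRIMARY KEY` constraint over columns that are also foreign keys, so the concern in point 2 of the question does not apply.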

Exceptions to this generalization are:

  • Sometimes the business rules allow for identical fact rows. In this case, you need a surrogate key to uniquely identify a fact record.
  • Some ETL tools perform better if you have a surrogate primary key, especially when the ETL needs to update or insert a row and then delete a previous fact record.

In these cases, a surrogate primary key is beneficial. However, it is not something you expose to the end user; it is merely a convenience to meet technical needs.
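The first exception can be sketched the same way. This is only an illustration under assumed names (`fact_scan`, `scan_id` and `demo_surrogate_key` are hypothetical), with the dimension tables left out for brevity: when two fact rows are legitimately identical, an auto-generated surrogate key is the only thing that lets a load process address one of them.

```python
import sqlite3


def demo_surrogate_key():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE fact_scan (
            scan_id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, never shown to users
            date_key    INTEGER NOT NULL,
            product_key INTEGER NOT NULL,
            quantity    INTEGER NOT NULL
        )
    """)

    # Two events that are identical in every business column.
    row = (20240101, 1, 1)
    for _ in range(2):
        conn.execute(
            "INSERT INTO fact_scan (date_key, product_key, quantity) VALUES (?, ?, ?)",
            row,
        )

    ids = [r[0] for r in conn.execute("SELECT scan_id FROM fact_scan ORDER BY scan_id")]

    # An ETL step can now remove exactly one of the twin rows.
    conn.execute("DELETE FROM fact_scan WHERE scan_id = ?", (ids[0],))
    remaining = conn.execute("SELECT COUNT(*) FROM fact_scan").fetchone()[0]
    return ids, remaining


if __name__ == "__main__":
    print(demo_surrogate_key())
```

Without `scan_id`, a `DELETE` filtered on the business columns would match both rows at once, which is the kind of technical need the surrogate key is there to meet.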


 