How can I populate my Fact table?

I am not a seasoned BI developer, so I need help populating my fact table. First, I populated all my dimensions from my production database (I'm not using a staging database or staging tables) using the appropriate SSIS components.

DimParent, DimStudent, DimManager, and DimFacilitator use the natural key as the primary key. The rest of the dimensions use a surrogate key as the primary key. The reason for using the natural keys is that I have the same database model for my production (OLTP) database across multiple schemas, which act as my different campus locations.

[Image: DW diagram]

My measurable data is still in my production database, and I can't figure out how to populate my fact table.

[Image: production OLTP]

I was thinking of using a large query with joins, but it might get too complex given the way I populated DimAssessmentType with this query:

select PK_Assessment, [Description] 
from Auckland_Park.Assessment 
union 
select 3, 'International'

Don't be inconsistent. Use surrogate keys for everything. Then no matter what happens (e.g. a campus comes online that does not follow this rule), you can account for it. Being inconsistent just makes work for yourself. Do the design properly now. Reloading a dimension and a fact table once the fact holds three years of data is a hell of a job.

Anyway. The way I populate a fact is:

  1. Load the facts into a staging table.
  2. The staging table has additional columns that hold your surrogate keys.
  3. Run an update statement on the staging table that fills in the surrogate keys.
  4. Pick an appropriate window in your fact. Delete and reload that window.
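
The four steps above can be sketched roughly as follows. This is only an illustration, not the exact process from the answer: it uses Python's built-in sqlite3 module so that it runs anywhere, whereas on SQL Server the same statements would be T-SQL run from SSIS Execute SQL tasks. The table and column names (StagingResult, FactResult, StudentSK, SourceStudentId, Campus, Mark) and the campus name Campus_B are assumptions made up for the example.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory warehouse. All names here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE DimStudent (
        StudentSK       INTEGER PRIMARY KEY,  -- surrogate key
        Campus          TEXT NOT NULL,        -- source schema, e.g. 'Auckland_Park'
        SourceStudentId INTEGER NOT NULL,     -- key from the OLTP system
        UNIQUE (Campus, SourceStudentId)
    );
    CREATE TABLE StagingResult (
        Campus TEXT, SourceStudentId INTEGER, AssessmentDate TEXT, Mark REAL,
        StudentSK INTEGER                     -- step 2: extra column for the SK
    );
    CREATE TABLE FactResult (
        StudentSK INTEGER NOT NULL, AssessmentDate TEXT NOT NULL, Mark REAL
    );
    INSERT INTO DimStudent VALUES (1, 'Auckland_Park', 10);
    INSERT INTO DimStudent VALUES (2, 'Auckland_Park', 11);
    INSERT INTO DimStudent VALUES (3, 'Campus_B', 10);
    -- Existing fact rows: one outside the reload window, one stale row inside it.
    INSERT INTO FactResult VALUES (1, '2024-02-15', 60.0);
    INSERT INTO FactResult VALUES (1, '2024-03-01', 50.0);
    """)
    return conn

def load_fact(conn, rows, window_start, window_end):
    """Run steps 1, 3 and 4; return the number of rows with no surrogate key."""
    cur = conn.cursor()
    # Step 1: load the raw facts into the staging table.
    cur.execute("DELETE FROM StagingResult")
    cur.executemany(
        "INSERT INTO StagingResult (Campus, SourceStudentId, AssessmentDate, Mark) "
        "VALUES (?, ?, ?, ?)", rows)
    # Step 3: fill in the surrogate key by matching on the source system key.
    cur.execute("""
        UPDATE StagingResult
        SET StudentSK = (SELECT d.StudentSK FROM DimStudent d
                         WHERE d.Campus = StagingResult.Campus
                           AND d.SourceStudentId = StagingResult.SourceStudentId)
    """)
    # Rows that found no match stay NULL and can be inspected in staging.
    unmatched = cur.execute(
        "SELECT COUNT(*) FROM StagingResult WHERE StudentSK IS NULL").fetchone()[0]
    # Step 4: delete the chosen window from the fact, then reload it.
    cur.execute("DELETE FROM FactResult WHERE AssessmentDate BETWEEN ? AND ?",
                (window_start, window_end))
    cur.execute("""
        INSERT INTO FactResult (StudentSK, AssessmentDate, Mark)
        SELECT StudentSK, AssessmentDate, Mark
        FROM StagingResult WHERE StudentSK IS NOT NULL
    """)
    conn.commit()
    return unmatched

conn = build_demo_db()
rows = [
    ("Auckland_Park", 10, "2024-03-01", 71.0),
    ("Campus_B", 10, "2024-03-01", 64.5),       # same source id, other campus
    ("Auckland_Park", 99, "2024-03-02", 88.0),  # no matching dimension row
]
unmatched = load_fact(conn, rows, "2024-03-01", "2024-03-31")
fact_rows = conn.execute(
    "SELECT StudentSK, AssessmentDate, Mark FROM FactResult "
    "ORDER BY StudentSK, AssessmentDate").fetchall()
```

Because the surrogate keys are written into the staging table before anything touches the fact, unmatched rows remain visible there, which is one way to get the easier troubleshooting this approach is meant to offer.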

It sounds like you might want to do an "inline lookup" in SSIS instead to find the SKs. That's fine, but it does make troubleshooting difficult. Also, the SSIS lookup component doesn't scale well (i.e. it works for a few rows and is very slow for many rows). It also doesn't handle slowly changing dimensions (SCDs) very well.

Your statement "I can't seem to figure out..." is very vague. Follow the four steps above and tell me which one you can't figure out.

One issue might be that you are not preserving source system keys in your dimensions, so you can't look up the new surrogate keys based on source system keys.
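
As a rough sketch of what preserving source system keys could look like: the dimension gets its own surrogate key but also stores the campus schema and the original key, so a later lookup from source key to surrogate key is possible. Again this uses Python's sqlite3 purely so the example is runnable, with attached in-memory databases standing in for the per-campus schemas. The Student table, the PK_Student and FullName columns, and the Campus_B schema are assumptions, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Attached databases stand in for the per-campus schemas (hypothetical data).
ATTACH DATABASE ':memory:' AS Auckland_Park;
ATTACH DATABASE ':memory:' AS Campus_B;
CREATE TABLE Auckland_Park.Student (PK_Student INTEGER PRIMARY KEY, FullName TEXT);
CREATE TABLE Campus_B.Student (PK_Student INTEGER PRIMARY KEY, FullName TEXT);
INSERT INTO Auckland_Park.Student VALUES (1, 'Student A'), (2, 'Student B');
INSERT INTO Campus_B.Student VALUES (1, 'Student C');

-- The dimension keeps a surrogate key AND the source system key per campus.
CREATE TABLE DimStudent (
    StudentSK       INTEGER PRIMARY KEY AUTOINCREMENT,
    Campus          TEXT NOT NULL,
    SourceStudentId INTEGER NOT NULL,
    FullName        TEXT,
    UNIQUE (Campus, SourceStudentId)
);
""")

# Load the dimension from every campus, tagging each row with its origin.
conn.execute("""
    INSERT INTO DimStudent (Campus, SourceStudentId, FullName)
    SELECT 'Auckland_Park', PK_Student, FullName FROM Auckland_Park.Student
    UNION ALL
    SELECT 'Campus_B', PK_Student, FullName FROM Campus_B.Student
""")
conn.commit()

def lookup_sk(conn, campus, source_id):
    """Return the surrogate key for a (campus, source key) pair, or None."""
    row = conn.execute(
        "SELECT StudentSK FROM DimStudent WHERE Campus = ? AND SourceStudentId = ?",
        (campus, source_id)).fetchone()
    return row[0] if row else None

dim_count = conn.execute("SELECT COUNT(*) FROM DimStudent").fetchone()[0]
```

Here the two campuses both have a student with source key 1, yet each gets its own surrogate key, because the lookup matches on the campus and the source key together.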
