
Fact Table Recommendation

I have a data mart that only needs to capture a product's serial number, the date of the activity, and where the activity took place (which account).

There are five possible activities. The issue I have is this: two of the activities take place at the warehouse level, and the remaining three take place at the account level (the warehouse, WH, does not apply). Ultimately, however, every warehouse rolls up to a master account.

So if I had one fact table, I would essentially need two foreign keys (FKs), and you would have to traverse the fact table to build the WH > Account hierarchy, which seems hard to maintain. I'd like one dimension table.

Or is it recommended that I split this into two fact tables, even though the only difference between the two tables is whether the activity took place at a warehouse or not?

The reporting will be at the account level, but having the WH information may be useful at some point. I also need to check for duplicates, etc., which is why I was leaning towards the first option, but I don't know how to handle the hierarchies appropriately.

Single Fact Table Design

  • Item: 1
  • Account: 14
  • Warehouse: 2
  • ActivityType: 3
  • Date: 20130204
  • SerialNumber: 123456
  • Count: 1

Dual Fact Table Design

Table 1

  • Item: 1
  • Warehouse: 2
  • ActivityType: 3
  • Date: 20130204
  • SerialNumber: 123456
  • Count: 1

Table 2

  • Item: 1
  • Account: 2
  • ActivityType: 3
  • Date: 20130204
  • SerialNumber: 123456
  • Count: 1

I've interpreted your situation as:

  • All activities require an account.
  • Some activities involve a warehouse.
  • The selection of a warehouse implies an account.
  • The accounts mentioned in the points above are of the same type (there is only one account dimension table).

In which case you should be OK with the single fact table design:

[ACTIVITY_FACT]
SK                    (Optional, I find unique surrogate PKs useful)
ITEM_SK               (Link to your ITEM_DIM table)
ACCOUNT_SK            (Link to your ACCOUNT_DIM table)
WAREHOUSE_SK          (Link to your WAREHOUSE_DIM table, -1 for no warehouse activities)
ACTIVITY_TYPE_SK      (Link to your ACTIVITY_TYPE_DIM table) 
ACTIVITY_DATE_SK      (Link to your DATE_DIM table)
ITEM_SERIAL_NUMBER
ITEM_COUNT

Have a record in your WAREHOUSE dimension for NONE or NOT APPLICABLE and give it an obvious special-condition SK value of -1 or -9, or whatever your shop uses for such things.

For activity records that reference a warehouse, populate the appropriate warehouse SK and the SK of the account that the warehouse belongs to.

For activities that do not involve a warehouse, populate the warehouse SK with the NONE / NOT APPLICABLE warehouse dimension record and the account SK with the appropriate account.
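The two load rules above can be sketched as a small key-resolution step. This is only an illustration: the `WAREHOUSE_TO_ACCOUNT` mapping, the `resolve_keys` helper, and the key values are hypothetical, and it assumes your load process can look up which account owns each warehouse.

```python
# Sentinel surrogate key for the NONE / NOT APPLICABLE warehouse row (assumed value).
NO_WAREHOUSE_SK = -1

# Hypothetical lookup: warehouse surrogate key -> owning account surrogate key.
WAREHOUSE_TO_ACCOUNT = {2: 14, 3: 14, 4: 20}


def resolve_keys(account_sk=None, warehouse_sk=None):
    """Return (account_sk, warehouse_sk) for a fact row; neither value is ever None."""
    if warehouse_sk is not None:
        # Warehouse-level activity: the warehouse implies its account.
        owner = WAREHOUSE_TO_ACCOUNT[warehouse_sk]
        if account_sk is not None and account_sk != owner:
            raise ValueError("warehouse does not belong to the given account")
        return owner, warehouse_sk
    if account_sk is None:
        raise ValueError("every activity needs an account")
    # Account-level activity: point at the NOT APPLICABLE warehouse row.
    return account_sk, NO_WAREHOUSE_SK


warehouse_activity = resolve_keys(warehouse_sk=2)
account_activity = resolve_keys(account_sk=14)
```

Because both keys are always filled in, the fact table columns can be declared NOT NULL.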

Now your fact table can be joined to your Account and Warehouse dimension tables without having to worry about outer joins or null handling. This should allow you and your users to explore the warehouse dimension data as required, and you won't have to faff about managing two tables that contain essentially the same data.
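As a rough sketch of how this plays out, here is the design in an in-memory SQLite database. The dimension column names (`ACCOUNT_NAME`, `WAREHOUSE_NAME`), the sample rows, and the activity type 5 are assumptions for illustration only; the fact table follows the outline above, minus the optional SK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACCOUNT_DIM (ACCOUNT_SK INTEGER PRIMARY KEY, ACCOUNT_NAME TEXT);
CREATE TABLE WAREHOUSE_DIM (WAREHOUSE_SK INTEGER PRIMARY KEY, WAREHOUSE_NAME TEXT);
CREATE TABLE ACTIVITY_FACT (
    ITEM_SK INTEGER,
    ACCOUNT_SK INTEGER NOT NULL REFERENCES ACCOUNT_DIM (ACCOUNT_SK),
    WAREHOUSE_SK INTEGER NOT NULL REFERENCES WAREHOUSE_DIM (WAREHOUSE_SK),
    ACTIVITY_TYPE_SK INTEGER,
    ACTIVITY_DATE_SK INTEGER,
    ITEM_SERIAL_NUMBER TEXT,
    ITEM_COUNT INTEGER
);
INSERT INTO ACCOUNT_DIM VALUES (14, 'Master account 14');
-- The -1 row stands for NONE / NOT APPLICABLE.
INSERT INTO WAREHOUSE_DIM VALUES (-1, 'NOT APPLICABLE'), (2, 'Warehouse 2');
-- One warehouse-level activity and one account-level activity.
INSERT INTO ACTIVITY_FACT VALUES
    (1, 14, 2, 3, 20130204, '123456', 1),
    (1, 14, -1, 5, 20130205, '123456', 1);
""")

# Plain inner joins keep every fact row, including the account-level one.
by_warehouse = conn.execute("""
    SELECT a.ACCOUNT_NAME, w.WAREHOUSE_NAME, SUM(f.ITEM_COUNT)
    FROM ACTIVITY_FACT f
    JOIN ACCOUNT_DIM a ON a.ACCOUNT_SK = f.ACCOUNT_SK
    JOIN WAREHOUSE_DIM w ON w.WAREHOUSE_SK = f.WAREHOUSE_SK
    GROUP BY a.ACCOUNT_NAME, w.WAREHOUSE_NAME
    ORDER BY w.WAREHOUSE_NAME
""").fetchall()

# Account-level reporting ignores the warehouse dimension entirely.
by_account = conn.execute("""
    SELECT a.ACCOUNT_NAME, SUM(f.ITEM_COUNT)
    FROM ACTIVITY_FACT f
    JOIN ACCOUNT_DIM a ON a.ACCOUNT_SK = f.ACCOUNT_SK
    GROUP BY a.ACCOUNT_NAME
""").fetchall()
```

The account-level activity shows up under the NOT APPLICABLE warehouse in the first query and is counted in the account total in the second, with no outer join in either.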

A possibility is to define the hierarchy in a single dimension table. Guessing at what you're dealing with, I came up with the following.

Outline of dimension table:

TABLE: Account

Account_ID  <surrogate key>
Account     <account name, identifier>
Warehouse   <warehouse name, identifier>

Sample data:

Account_ID   Account   Warehouse
    1           A        n/a
    2           B        n/a
    3           C        n/a
    4           W        wh1
    5           W        wh2
    6           Z        wh3
    7           Z        n/a

Account_ID is just a surrogate key, with no intrinsic meaning or value.

Account lists the accounts. Here, I show five: A, B, C, W and Z. Select distinct to get the list of accounts; joining to a fact table by Account_ID where Account = “W” gets all data for that account (for however many warehouses, if applicable).

Warehouse lists all warehouses and the account they are associated with. Here, “W” is the account for two separate warehouses (wh1, wh2); Z is associated with warehouse wh3, but could also be used by fact rows with “no” warehouse. Joining to a fact table by Account_ID where Warehouse = “wh1” gets all data for that warehouse.

Using this, with Account_ID in a fact table, you could drill down to all entries for any given account or for a specific warehouse (or for no warehouse, if there is value in that).
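A minimal sketch of that drill-down, using the sample dimension rows above in an in-memory SQLite database. The `Activity_Fact` table, its columns, its rows, and the `total` helper are hypothetical, added only so there is something to join to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (Account_ID INTEGER PRIMARY KEY, Account TEXT, Warehouse TEXT);
INSERT INTO Account VALUES
    (1, 'A', 'n/a'), (2, 'B', 'n/a'), (3, 'C', 'n/a'),
    (4, 'W', 'wh1'), (5, 'W', 'wh2'), (6, 'Z', 'wh3'), (7, 'Z', 'n/a');
-- Hypothetical fact table: one row per activity, keyed only by Account_ID.
CREATE TABLE Activity_Fact (
    Account_ID INTEGER REFERENCES Account (Account_ID),
    SerialNumber TEXT,
    ItemCount INTEGER
);
INSERT INTO Activity_Fact VALUES
    (4, '123456', 1), (5, '123457', 1), (5, '123458', 1), (7, '123459', 1);
""")


def total(column, value):
    """Sum ItemCount for fact rows whose dimension attribute matches value."""
    # Only the two known dimension attributes may be used as a filter column.
    if column not in ("Account", "Warehouse"):
        raise ValueError(column)
    row = conn.execute(
        "SELECT COALESCE(SUM(f.ItemCount), 0) FROM Activity_Fact f "
        f"JOIN Account a ON a.Account_ID = f.Account_ID WHERE a.{column} = ?",
        (value,),
    ).fetchone()
    return row[0]


account_w = total("Account", "W")          # every warehouse under account W
warehouse_wh1 = total("Warehouse", "wh1")  # a single warehouse
account_z = total("Account", "Z")          # includes the "n/a" warehouse row
```

The fact table carries a single foreign key, and the account roll-up comes from the dimension row rather than from a second key on the fact.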

There are lots of variations and permutations possible with this kind of approach.
