How to design a fact table for delivery data

I'm building a data warehouse that includes delivery information for restaurants. The data is stored in SQL Server 2005 and is then loaded into a SQL Server Analysis Services 2005 cube.

The deliveries information is held in the following fact table:

FactDeliveres

  • BranchKey
  • DeliveryDateKey
  • ProductKey
  • InvoiceNumber (DD: degenerate dimension)
  • Quantity
  • UnitCost
  • LineCost

Note:

  • The granularity of FactDeliveres is each line on the invoice
  • The Product dimension includes supplier information

And the problem: there is no primary key for the fact table. The primary key should be something that uniquely identifies each delivery, plus the ProductKey. But I have no way to uniquely identify a delivery.

In the source OLTP database there is a DeliveryID that is unique for every delivery, but it is an internal ID that is meaningless to users. The InvoiceNumber is the supplier's invoice number; it is typed in manually, so we get duplicates.

In the cube, I created a dimension based only on the InvoiceNumber field in FactDeliveres. That means that when you group by InvoiceNumber, you might get 2 deliveries combined only because they (mistakenly) have the same InvoiceNumber.
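
As a minimal illustration of this grouping problem, the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows are hypothetical, and the DeliveryID column is an assumption added only so the two groupings can be compared; it is not part of the fact table listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE FactDeliveres (
        DeliveryID    INTEGER,  -- source-system ID (assumed to be carried over)
        ProductKey    INTEGER,
        InvoiceNumber TEXT,     -- typed in manually, so not guaranteed unique
        Quantity      INTEGER,
        UnitCost      REAL
    )
    """
)

# Hypothetical data: deliveries 1 and 2 were both keyed in as 'INV-100'.
conn.executemany(
    "INSERT INTO FactDeliveres VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "INV-100", 5, 2.0),
        (1, 11, "INV-100", 3, 4.0),
        (2, 10, "INV-100", 7, 2.0),
    ],
)

# Grouping by the manually typed invoice number merges the two deliveries.
by_invoice = conn.execute(
    "SELECT InvoiceNumber, SUM(Quantity * UnitCost) "
    "FROM FactDeliveres GROUP BY InvoiceNumber"
).fetchall()

# Grouping by the source DeliveryID keeps them apart.
by_delivery = conn.execute(
    "SELECT DeliveryID, SUM(Quantity * UnitCost) "
    "FROM FactDeliveres GROUP BY DeliveryID ORDER BY DeliveryID"
).fetchall()

print(by_invoice)   # a single combined row for 'INV-100'
print(by_delivery)  # one row per delivery
```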

I feel that I need to include the DeliveryID (to be called DeliveryKey), but I'm not sure how.

So, do I:

  1. Use that as the underlying key for the InvoiceNumber dimension?
  2. Create a DimDelivery that grows every time there is a new delivery? That could mean that some attributes come out of FactDeliveries and go into DimDelivery, like DeliveryDate, Supplier, InvoiceNumber.

After all that, I could just ask you: how do I create a Deliveries cube when I have the following information in my source database?

DeliveryHeaders

  • DeliveryID (PK)
  • DeliveryDate
  • SupplierID (FK)
  • InvoiceNumber (typed in manually)

DeliveryDetails

  • DeliveryID (PK)
  • ProductID (PK)
  • Quantity
  • UnitCost

I would have Quantity, UnitCost, InvoiceNumber and DeliveryID all in the fact table. Both InvoiceNumber and DeliveryID are degenerate dimensions, because they change with every fact (or with very few facts). You could put them in their own dimension if you have a large number of items on each order. The model below may not be 100% correct if you have multiple deliveries on an invoice, but it will be close. Check out Kimball; he might have an example of a star schema for this business scenario.

Fact table:
OrderDateID (not in your model, but probably should be, date dimension in a role)
DeliveryDateID (date dimension in a role)
SupplierID (supplier dimension surrogate key)
InvoiceID (invoice dimension surrogate key)
ProductID (product dimension surrogate key)
Quantity (fact)
UnitCost (fact)
InvoiceNumber (optional)
DeliveryID (optional)

with the usual date dimension table and the following dimensions:

Supplier Dim:
SupplierID (surrogate)
SupplierCode and data

Invoice Dim:
InvoiceID (surrogate)
InvoiceNumber (optional)
DeliveryID (optional)

Product Dim:
ProductID (surrogate)
ProductCode and Data
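
The fact table and dimensions listed above could be declared roughly as follows. This is only a sketch: it uses Python's built-in sqlite3 module in place of SQL Server, the table names (DimDate, DimSupplier, DimInvoice, DimProduct, FactDelivery), the CalendarDate column, and the sample rows are all assumptions, and the optional InvoiceNumber and DeliveryID columns are kept only in the invoice dimension. The date dimension is joined twice to show what "date dimension in a role" means in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE DimDate     (DateID INTEGER PRIMARY KEY, CalendarDate TEXT);
    CREATE TABLE DimSupplier (SupplierID INTEGER PRIMARY KEY, SupplierCode TEXT);
    CREATE TABLE DimInvoice  (InvoiceID INTEGER PRIMARY KEY,
                              InvoiceNumber TEXT, DeliveryID INTEGER);
    CREATE TABLE DimProduct  (ProductID INTEGER PRIMARY KEY, ProductCode TEXT);

    CREATE TABLE FactDelivery (
        OrderDateID    INTEGER REFERENCES DimDate(DateID),    -- date in the "order" role
        DeliveryDateID INTEGER REFERENCES DimDate(DateID),    -- date in the "delivery" role
        SupplierID     INTEGER REFERENCES DimSupplier(SupplierID),
        InvoiceID      INTEGER REFERENCES DimInvoice(InvoiceID),
        ProductID      INTEGER REFERENCES DimProduct(ProductID),
        Quantity       INTEGER,
        UnitCost       REAL
    );
    """
)

# Hypothetical sample rows: dimensions first, then one fact row.
conn.executemany(
    "INSERT INTO DimDate VALUES (?, ?)",
    [(20240101, "2024-01-01"), (20240103, "2024-01-03")],
)
conn.execute("INSERT INTO DimSupplier VALUES (1, 'SUP-A')")
conn.execute("INSERT INTO DimInvoice VALUES (1, 'INV-100', 501)")
conn.execute("INSERT INTO DimProduct VALUES (1, 'FLOUR')")
conn.execute("INSERT INTO FactDelivery VALUES (20240101, 20240103, 1, 1, 1, 5, 2.0)")

# The same date dimension is joined once per role.
rows = conn.execute(
    """
    SELECT od.CalendarDate, dd.CalendarDate, p.ProductCode, f.Quantity * f.UnitCost
    FROM FactDelivery f
    JOIN DimDate od   ON od.DateID = f.OrderDateID
    JOIN DimDate dd   ON dd.DateID = f.DeliveryDateID
    JOIN DimProduct p ON p.ProductID = f.ProductID
    """
).fetchall()
print(rows)
```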

Always remember, your (star schema) data warehouse is not going to be structured at all like your OLTP data; it's all about the facts and the dimensions that describe them.

Fact table PKs are almost always surrogate keys. Each fact is part of several dimensions, so the fact has FKs to the dimensions, but no real key of its own.

A Delivery Fact (a Line Item) belongs to a Branch, has a Product, is part of a larger Delivery, and occurs on a particular Date. That sounds like 4 independent dimensions.

The Delivery dimension has its own PK, and it has the invoice number as a dimension attribute, plus, perhaps, other attributes of the delivery as a whole.

Each Delivery Line Item Fact is associated with one Delivery and the invoice number for that Delivery.
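
One way to sketch this Delivery dimension, assuming the DeliveryHeaders and DeliveryDetails source tables from the question, is shown below. It uses Python's built-in sqlite3 module rather than SQL Server, and the names DimDelivery, FactDeliveryLine, DeliveryKey and DeliveryLineKey, as well as the sample rows, are hypothetical. For brevity the source ProductID is used directly as ProductKey; a real load would look it up in the product dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Source tables, as described in the question.
    CREATE TABLE DeliveryHeaders (DeliveryID INTEGER PRIMARY KEY, DeliveryDate TEXT,
                                  SupplierID INTEGER, InvoiceNumber TEXT);
    CREATE TABLE DeliveryDetails (DeliveryID INTEGER, ProductID INTEGER,
                                  Quantity INTEGER, UnitCost REAL,
                                  PRIMARY KEY (DeliveryID, ProductID));

    -- Warehouse tables (names are assumptions).
    CREATE TABLE DimDelivery (
        DeliveryKey   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        DeliveryID    INTEGER UNIQUE,                     -- source-system ID
        InvoiceNumber TEXT                                -- attribute; may repeat
    );
    CREATE TABLE FactDeliveryLine (
        DeliveryLineKey INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key of the fact row
        DeliveryKey     INTEGER REFERENCES DimDelivery(DeliveryKey),
        ProductKey      INTEGER,
        Quantity        INTEGER,
        UnitCost        REAL
    );
    """
)

# Hypothetical source data: two deliveries typed in with the same invoice number.
conn.executemany(
    "INSERT INTO DeliveryHeaders VALUES (?, ?, ?, ?)",
    [(501, "2024-01-03", 1, "INV-100"), (502, "2024-01-04", 2, "INV-100")],
)
conn.executemany(
    "INSERT INTO DeliveryDetails VALUES (?, ?, ?, ?)",
    [(501, 10, 5, 2.0), (501, 11, 3, 4.0), (502, 10, 7, 2.0)],
)

# One dimension member per source delivery.
conn.execute(
    "INSERT INTO DimDelivery (DeliveryID, InvoiceNumber) "
    "SELECT DeliveryID, InvoiceNumber FROM DeliveryHeaders ORDER BY DeliveryID"
)
# One fact row per delivery line, pointing at its delivery member.
conn.execute(
    """
    INSERT INTO FactDeliveryLine (DeliveryKey, ProductKey, Quantity, UnitCost)
    SELECT d.DeliveryKey, s.ProductID, s.Quantity, s.UnitCost
    FROM DeliveryDetails s
    JOIN DimDelivery d ON d.DeliveryID = s.DeliveryID
    """
)

# Totals per delivery, with the invoice number shown as an attribute.
totals = conn.execute(
    """
    SELECT d.DeliveryKey, d.InvoiceNumber, SUM(f.Quantity * f.UnitCost)
    FROM FactDeliveryLine f
    JOIN DimDelivery d ON d.DeliveryKey = f.DeliveryKey
    GROUP BY d.DeliveryKey, d.InvoiceNumber
    ORDER BY d.DeliveryKey
    """
).fetchall()
print(totals)
```

Because the grouping key is the delivery's surrogate key rather than the invoice number, the two deliveries stay separate even though they share 'INV-100'.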
