
Data Warehouse Design/Modeling (based on Figure in Data Mining textbook)

I found a schema in Google Images (see below) that illustrates a problem I'm having in my data warehouse design:

[Image: enter image description here]

My design is different, but this is the simplest figure I could find to convey my question. Given the figure, how could the schema accommodate the following scenario: a product has a unique number assigned to it by the SalesOrg (salesOrg_product_number)? For example, a salesOrg sells food items and assigns all food items of the same kind the same unique salesOrg_product_number. A different salesOrg would have a different salesOrg_product_number for that type of product.

I'm inclined to place the salesOrg_product_number attribute in the Product dimension table, but part of me thinks it should be in the salesOrg dimension table instead. Which of these is the correct way, in a data warehouse (not relational DB) design, to maintain the star schema?

In a perfect world, the primary key of a dimension table should be just a surrogate key, without any business meaning. Table IDs should be invisible to end users, but business codes should of course be available.

A possible solution would be to have a product table with a structure like:

Product_id
Product_desc
Product_SO1_number
Product_SO2_number
...

Of course, this requires showing the correct field to the correct Sales Organization. Depending on your reporting tool, this can be more or less difficult. For example, if you write your queries manually, you just need to put the right column in your SELECT.
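As a rough illustration of this first option, here is a minimal sketch using Python's built-in sqlite3 module. The sample values ('Apple', 'F-100', 'A-77'), the SO_COLUMNS mapping, and the helper names are assumptions made up for the example; only the column names come from the structure above.

```python
import sqlite3

# Hypothetical mapping from a sales organization code to its column
# in the wide product table (one column per sales organization).
SO_COLUMNS = {"SO1": "Product_SO1_number", "SO2": "Product_SO2_number"}


def build_wide_product_dim(conn):
    """Create the wide product dimension and load one made-up row."""
    conn.execute(
        """CREATE TABLE product (
               Product_id INTEGER PRIMARY KEY,  -- surrogate key
               Product_desc TEXT,
               Product_SO1_number TEXT,
               Product_SO2_number TEXT
           )"""
    )
    conn.execute("INSERT INTO product VALUES (1, 'Apple', 'F-100', 'A-77')")


def product_number_for(conn, product_id, sales_org):
    """Return the product number as seen by the given sales organization."""
    # Column names cannot be bound as parameters, so the column is taken
    # from a fixed whitelist; an unknown sales_org raises KeyError.
    column = SO_COLUMNS[sales_org]
    row = conn.execute(
        f"SELECT {column} FROM product WHERE Product_id = ?", (product_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
build_wide_product_dim(conn)
so1_number = product_number_for(conn, 1, "SO1")
so2_number = product_number_for(conn, 1, "SO2")
```

Note that every new sales organization means a new column and a new entry in the mapping, which is the main maintenance cost of this layout.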

Another possibility would be a product/sales_org table, which combines the Product and Sales_Org tables:

Product_Sales_Org_id
Product_id
Sales_Org_id
Product_SO_number
...

This table is a child of the two dimension tables, and the fact table carries a Product_Sales_Org_id column. For a given Product and Sales Organization, Product_SO_number returns the correct number for that SO.
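A minimal sketch of this layout, again with sqlite3, might look like the following. The sales_fact table, its quantity measure, the UNIQUE constraint, and all sample rows are assumptions for illustration; the dimension and relationship columns follow the structure above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (Product_id INTEGER PRIMARY KEY, Product_desc TEXT);
CREATE TABLE sales_org (Sales_Org_id INTEGER PRIMARY KEY, Sales_Org_desc TEXT);

-- Relationship table: one row per (product, sales organization) pair.
CREATE TABLE product_sales_org (
    Product_Sales_Org_id INTEGER PRIMARY KEY,
    Product_id INTEGER REFERENCES product(Product_id),
    Sales_Org_id INTEGER REFERENCES sales_org(Sales_Org_id),
    Product_SO_number TEXT,
    UNIQUE (Product_id, Sales_Org_id)
);

-- Hypothetical fact table that points at the relationship table.
CREATE TABLE sales_fact (
    Product_Sales_Org_id INTEGER REFERENCES product_sales_org(Product_Sales_Org_id),
    quantity INTEGER
);

-- Made-up sample data.
INSERT INTO product VALUES (1, 'Apple');
INSERT INTO sales_org VALUES (10, 'North'), (20, 'South');
INSERT INTO product_sales_org VALUES (100, 1, 10, 'F-100'), (200, 1, 20, 'A-77');
INSERT INTO sales_fact VALUES (100, 5), (200, 3), (100, 2);
""")

# Total quantity per sales organization, showing each organization's
# own number for the same product.
rows = conn.execute("""
SELECT so.Sales_Org_desc, p.Product_desc, pso.Product_SO_number, SUM(f.quantity)
FROM sales_fact f
JOIN product_sales_org pso ON pso.Product_Sales_Org_id = f.Product_Sales_Org_id
JOIN product p ON p.Product_id = pso.Product_id
JOIN sales_org so ON so.Sales_Org_id = pso.Sales_Org_id
GROUP BY so.Sales_Org_desc, p.Product_desc, pso.Product_SO_number
ORDER BY so.Sales_Org_desc
""").fetchall()
```

Here the same product appears under a different Product_SO_number for each sales organization, and adding an organization only adds rows, not columns.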

If you want to keep this in a star schema structure, you can merge Product, Sales_Org, and Product_Sales_Org into a single table like:

Product_Sales_Org_id
Product_id
Sales_Org_id
Product_desc
Sales_Org_desc
Product_SO_number
...
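The flattened variant can be sketched the same way. The table name product_sales_org_dim, the sales_fact table, and the sample rows are hypothetical; the point is only that the fact table reaches every attribute with a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Single denormalized dimension: descriptions are repeated on every row.
CREATE TABLE product_sales_org_dim (
    Product_Sales_Org_id INTEGER PRIMARY KEY,
    Product_id INTEGER,
    Sales_Org_id INTEGER,
    Product_desc TEXT,
    Sales_Org_desc TEXT,
    Product_SO_number TEXT
);

-- Hypothetical fact table keyed on the combined dimension.
CREATE TABLE sales_fact (Product_Sales_Org_id INTEGER, quantity INTEGER);

-- Made-up sample data.
INSERT INTO product_sales_org_dim VALUES
    (100, 1, 10, 'Apple', 'North', 'F-100'),
    (200, 1, 20, 'Apple', 'South', 'A-77');
INSERT INTO sales_fact VALUES (100, 5), (200, 3), (100, 2);
""")

# One join is enough to report by sales organization and product number.
flat_rows = conn.execute("""
SELECT d.Sales_Org_desc, d.Product_SO_number, SUM(f.quantity)
FROM sales_fact f
JOIN product_sales_org_dim d ON d.Product_Sales_Org_id = f.Product_Sales_Org_id
GROUP BY d.Sales_Org_desc, d.Product_SO_number
ORDER BY d.Sales_Org_desc
""").fetchall()
```

The trade-off is redundancy: Product_desc and Sales_Org_desc are stored once per combination, so a description change has to be applied to several rows.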

Honestly, I would go for the second solution: keep the Product and Sales_Org tables separate, because they are two different business entities, and implement the relationship table in the middle.

I hope this helps.

