简体   繁体   中英

Datawarehouse design

I am going to design a Datawarehouse (although its not an easy process). I am wondering through out the ETL process, how the data in the Datawarehouse is going to extract/transform to Data Mart? Are there any model design within Datawarehouse vs Datamart? Also usually starschema or snowflake?so should we place the table like in the following

In Datawarehouse dim_tableA dim_tableB fact_tableA fact_tableB

And in Datamart A dim_tableA (full copy from datawarehouse) fact_tableA (full copy from datawarehouse)

And in Datamart B dim_tableB (full copy from datawarehouse) fact_tableB (full copy from datawarehouse)

is it something real life example which can demonstrate the model difference between datawarehouse and datamart?

I echo with both Nick's responses and in more technical way following Kimball methodology:

In my opinion and my experience. At high level ,we have data marts like Service Analytics , Financial Analytics , Sales Analytics , Marketing Analytics ,Customer Analytics etc. These were grouped as below
Subject Areas -> Logical grouping(Star Modelling) ->Data Marts -> Dimension &Fact (As per Kimball’s)
Example:
AP Real Time ->Supplier, Supplier Transaction’s , GL Data -> Financial Analytics + Customer Analytics->Physical Tables

Data marts contain repositories of summarized data collected for analysis on a specific section or unit within an organization, for example, the sales department. ... A data warehouse is a large centralized repository of data that contains information from many sources within an organization.

Depending on their needs, companies can use multiple data marts for different departments and opt for data mart consolidation by merging different marts to build a single data warehouse later. This approach is called the Kimball Dimensional Design Method. Another method, called The Inmon Approach, is to first design a data warehouse and then create multiple data marts for particular services as needed.

An example: In a data warehouse, email clicks are recorded based on a click date, with the email address being just one of the click parameters. For a CRM expert, the e-mail address (or any other customer identifier) will be the entry point: opposite each contact, the frequency of clicks, the date of the last click, etc.

The Datamart is a prism that adapts the data to the user. In this, its keys to success depend a lot on the way the data is organized. The more understandable it is to the user, the better the result. This is why the titles of each field and their method of calculation must stick as closely as possible to the uses of the trade.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM