
Data warehouse with unpivoted data

I am building a data warehouse for my employer's core ERP application, for one particular client.

In the source database, most of the information that becomes dimensions in the data warehouse is stored in an unpivoted form, mainly because the application is a product that is customized at each client's request.

For the current client, I can pivot and extract the data. My concern is that if we reuse the data warehouse for other customers, the model may not be able to adapt to the way each of them classifies its fields, and further customization would be required.

Please let me know whether there is a suitable mechanism to overcome this design issue.

The following is an example of how products are classified in the source database (the same applies to most of the other master data classifications):

Product Code  MasterClassification  MasterClassificationValue
------------  --------------------  -------------------------
AAA           Brand                 AA
AAA           Category              A

Same set of data pivoted:

Product Code  Brand  Category
------------  -----  --------
AAA           AA     A

Thanks in advance.

This is a classic and well-documented data problem. What you describe as 'unpivoted' is known as EAV (entity-attribute-value). I suggest you google 'EAV', perhaps together with 'reporting'. You are not alone!
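
To make the EAV shape concrete, here is a rough sketch in plain Python that folds the sample rows from the question into one record per product. The names `eav_rows` and `pivot_eav` are made up for illustration; nothing here comes from the actual ERP schema.

```python
from collections import defaultdict

# Sample EAV rows from the question: (entity, attribute, value).
eav_rows = [
    ("AAA", "Brand", "AA"),
    ("AAA", "Category", "A"),
]


def pivot_eav(rows):
    """Fold (entity, attribute, value) rows into one dict per entity.

    Hypothetical helper: attributes are discovered from the data,
    so a new classification needs no code change.
    """
    wide = defaultdict(dict)
    for product_code, attribute, value in rows:
        wide[product_code][attribute] = value
    return dict(wide)


pivoted = pivot_eav(eav_rows)
print(pivoted)  # {'AAA': {'Brand': 'AA', 'Category': 'A'}}
```

The point of the sketch is that the narrow form carries its own column list, which is why it survives new classifications while a fixed wide table does not.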

It makes sense that the dimensional data in the source system is stored unpivoted: it's a database, so it should be normalized. How you handle it in the data warehouse is another question.

In a previous job, we debated whether and how we should carry pivoted / denormalized / "wide and shallow" data. In our implementation, every table brought with it a view (containing the ETL logic) and a procedure (to load the table). That's a lot of infrastructure, so we thought twice before adding another table. Also, the requirement for pivoted data often came from the analytics team for use in Tableau, a tool that easily consumes unpivoted / "narrow and deep" data and pivots it -- so we often debated whether pivoted data was actually required.

Eventually we decided that we would occasionally carry pivoted data but only via a reporting view. (We had naming conventions to distinguish reporting views from ETL views.) I think this is an approach you should consider, for reasons you mentioned yourself: new categories could be added, rendering your pivoted design outdated. Also, if you have multiple clients using this data, each client could be interested in a different set of categories. You could cast a customized pivoted reporting view on top of this table for each client. That sounds like a lot of work, but I think it's less work than redoing a pivoted table every time you become aware that a new category has been added. Good luck!
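
As a rough illustration of the per-client reporting view idea, the sketch below keeps the narrow table as the only stored table and generates a pivoted view from a list of classifications. It uses Python's built-in `sqlite3` purely so it can run anywhere; the table name `product_classification`, the column name `ProductCode`, the view names, and the helper `create_reporting_view` are all assumptions for the example, not our actual implementation, and your warehouse's SQL dialect may offer a native PIVOT instead.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier (identifiers cannot be bound as parameters)."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text):
    """Quote an SQL string literal for use inside a view definition."""
    return "'" + text.replace("'", "''") + "'"


def create_reporting_view(conn, view_name, attributes):
    """(Re)create a pivoted view with one column per requested attribute."""
    columns = ",\n  ".join(
        f"MAX(CASE WHEN MasterClassification = {quote_literal(a)} "
        f"THEN MasterClassificationValue END) AS {quote_ident(a)}"
        for a in attributes
    )
    conn.execute(f"DROP VIEW IF EXISTS {quote_ident(view_name)}")
    conn.execute(
        f"CREATE VIEW {quote_ident(view_name)} AS\n"
        f"SELECT ProductCode,\n  {columns}\n"
        f"FROM product_classification\n"
        f"GROUP BY ProductCode"
    )


conn = sqlite3.connect(":memory:")
# Narrow table, mirroring the sample in the question (names assumed).
conn.execute(
    "CREATE TABLE product_classification ("
    "ProductCode TEXT, MasterClassification TEXT, "
    "MasterClassificationValue TEXT)"
)
conn.executemany(
    "INSERT INTO product_classification VALUES (?, ?, ?)",
    [("AAA", "Brand", "AA"), ("AAA", "Category", "A")],
)

# Each client gets a view over the classifications it cares about.
create_reporting_view(conn, "rpt_product_client_a", ["Brand", "Category"])
create_reporting_view(conn, "rpt_product_client_b", ["Brand"])

rows_a = conn.execute("SELECT * FROM rpt_product_client_a").fetchall()
rows_b = conn.execute("SELECT * FROM rpt_product_client_b").fetchall()
print(rows_a)  # [('AAA', 'AA', 'A')]
print(rows_b)  # [('AAA', 'AA')]
```

When a client adds a classification, only the attribute list passed to the helper changes and the view is regenerated; the stored table and its load procedure stay as they are.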
