Rapidly Changing Dimension

Recently, I came across the concept of Rapidly Changing Dimensions (mini-dimensions).

I understand the part where the fast-changing attributes are removed from the main dimension table and put into a junk dimension (with all possible combinations of the values of those attributes). This junk dimension is connected to the parent dimension table by an intermediate "bridge table" (the mini-dimension), which contains the parent dimension key and the junk dimension surrogate key (SK), along with start and end dates.

However, I fail to understand how this is implemented in practice.

So, say an RCD attribute changes: is the record in the mini-dimension (or parent dimension) then updated with the new SK from the junk dimension? If yes, how do we track history in such a scenario, given that we are destructively updating the existing record in the mini-dimension?

Alternatively, if a new record is created in the mini-dimension (as in SCD Type 2) containing the SK of the new junk dimension record, then we again have the problem of the mini-dimension growing over time. Also, does the fact table hold only the SK of the parent dimension, or the SKs of both the parent dimension and the junk dimension?

Can anyone please clarify with an example?

Assume there are 3 tables in the DW model:
1. PAT_DIM is the parent dimension
2. PAT_JNK_DIM is the junk dimension containing the RCD attributes
3. PAT_MINI_DIM is the mini-dim bridge table between 1 & 2 (above).

PAT_DIM:  
--------  
pat_dim_sk, 
pat_id,
pat_dob,
blood_type

PAT_MINI_DIM:  
------------  
pat_id,
pat_rcd_sk,
start_date,
end_date

PAT_JNK_DIM:  
----------  
pat_rcd_sk,
pat_weight,
pat_bmi

Given the above example, can anyone please help me understand how a Rapidly Changing Dimension (RCD) is modeled in a real-world scenario? How are the RCD tables interconnected in the data warehouse?

Generally speaking, a junk dimension consists of low-cardinality flags and indicators in a dimensional model.

In your case, you don't need a bridge table. The mini-dimension should contain the RCD attributes and should be joined directly to the fact table.
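A minimal sketch of that layout, using Python's built-in sqlite3 module. It reuses PAT_DIM and PAT_JNK_DIM from the question (with PAT_JNK_DIM playing the mini-dimension role). The fact table PAT_VISIT_FACT, its visit_date and charge columns, and the banded weight/BMI values are hypothetical, invented only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PAT_DIM (
    pat_dim_sk INTEGER PRIMARY KEY, pat_id TEXT, pat_dob TEXT, blood_type TEXT);
CREATE TABLE PAT_JNK_DIM (
    pat_rcd_sk INTEGER PRIMARY KEY, pat_weight TEXT, pat_bmi TEXT);
-- Hypothetical fact table: it carries BOTH surrogate keys, so no bridge is needed.
CREATE TABLE PAT_VISIT_FACT (
    visit_date TEXT, pat_dim_sk INTEGER, pat_rcd_sk INTEGER, charge REAL);

INSERT INTO PAT_DIM VALUES (1, 'P001', '1980-05-01', 'O+');
-- Assumed banded values; each row is one combination of RCD attributes.
INSERT INTO PAT_JNK_DIM VALUES (10, '70-79 kg', '18.5-24.9'),
                               (11, '80-89 kg', '25.0-29.9');
-- Each fact row records the RCD combination in effect at load time,
-- so history lives in the fact table and nothing is updated in place.
INSERT INTO PAT_VISIT_FACT VALUES ('2023-01-10', 1, 10, 120.0),
                                  ('2023-06-15', 1, 11, 95.0);
""")

rows = conn.execute("""
    SELECT f.visit_date, p.pat_id, j.pat_weight, j.pat_bmi
    FROM PAT_VISIT_FACT f
    JOIN PAT_DIM p     ON p.pat_dim_sk = f.pat_dim_sk
    JOIN PAT_JNK_DIM j ON j.pat_rcd_sk = f.pat_rcd_sk
    ORDER BY f.visit_date
""").fetchall()

for row in rows:
    print(row)
```

In this sketch the patient moves from one weight/BMI band to another between visits, and the only thing that changes is which pat_rcd_sk the new fact row points to.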

Depending on the requirement, SCD Type 1 or Type 4 may be best suited for RCDs.

But if you need to implement it as SCD Type 2, there are two options for joining the fact table to this RCD; which one applies depends on the data model design.

Option 1: If your design has the DIM SK in the fact table, you need to self-join the DIM table on the natural key of the DIM row to get the latest RCD row, where the Current Flag is true. (Note that the fact table is not updated when the RCD changes.)
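Option 1 could look roughly like this, again with sqlite3. It assumes a Type 2 PAT_DIM that holds pat_weight directly and has a current_flag column; those columns and the PAT_VISIT_FACT table are hypothetical names, not part of the original model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed Type 2 dimension: one row per version, pat_id is the natural key.
CREATE TABLE PAT_DIM (
    pat_dim_sk INTEGER PRIMARY KEY, pat_id TEXT,
    pat_weight TEXT, current_flag INTEGER);
-- Hypothetical fact table holding only the dimension SK.
CREATE TABLE PAT_VISIT_FACT (visit_date TEXT, pat_dim_sk INTEGER);

INSERT INTO PAT_DIM VALUES (1, 'P001', '70-79 kg', 0),
                           (2, 'P001', '80-89 kg', 1);
INSERT INTO PAT_VISIT_FACT VALUES ('2023-01-10', 1), ('2023-06-15', 2);
""")

rows = conn.execute("""
    SELECT f.visit_date,
           hist.pat_weight AS weight_at_visit,
           cur.pat_weight  AS weight_now
    FROM PAT_VISIT_FACT f
    -- Version of the dimension row the fact was loaded against.
    JOIN PAT_DIM hist ON hist.pat_dim_sk = f.pat_dim_sk
    -- Self-join on the natural key to reach the current version.
    JOIN PAT_DIM cur  ON cur.pat_id = hist.pat_id AND cur.current_flag = 1
    ORDER BY f.visit_date
""").fetchall()

for row in rows:
    print(row)
```

The fact rows are never touched; the extra hop through the natural key is the cost of getting the latest attributes.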

Option 2: In your design you can create a new key (a durable supernatural key, or stable key) for the DIM; this key is carried forward to all future versions of the DIM row. In the fact table you keep both the SK of the DIM and the durable key of the DIM. When you join, you can then avoid the self-join by joining on the durable key where the Current Flag is true to get the latest DIM row. If you need the earlier version of the DIM row (the one the fact was recorded against), just join on the DIM SK.
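A comparable sketch of Option 2. The pat_durable_key and current_flag columns and the PAT_VISIT_FACT table are hypothetical names chosen for illustration. Both joins below go straight from the fact table to the dimension, so the hop through the natural key is not needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed Type 2 dimension; pat_durable_key is identical across all versions.
CREATE TABLE PAT_DIM (
    pat_dim_sk INTEGER PRIMARY KEY, pat_durable_key INTEGER,
    pat_id TEXT, pat_weight TEXT, current_flag INTEGER);
-- Hypothetical fact table carrying both the version SK and the durable key.
CREATE TABLE PAT_VISIT_FACT (
    visit_date TEXT, pat_dim_sk INTEGER, pat_durable_key INTEGER);

INSERT INTO PAT_DIM VALUES (1, 100, 'P001', '70-79 kg', 0),
                           (2, 100, 'P001', '80-89 kg', 1);
INSERT INTO PAT_VISIT_FACT VALUES ('2023-01-10', 1, 100),
                                  ('2023-06-15', 2, 100);
""")

# Latest attributes: durable key plus current flag.
current_rows = conn.execute("""
    SELECT f.visit_date, d.pat_weight
    FROM PAT_VISIT_FACT f
    JOIN PAT_DIM d ON d.pat_durable_key = f.pat_durable_key
                  AND d.current_flag = 1
    ORDER BY f.visit_date
""").fetchall()

# Attributes as of the fact: join on the version SK.
historical_rows = conn.execute("""
    SELECT f.visit_date, d.pat_weight
    FROM PAT_VISIT_FACT f
    JOIN PAT_DIM d ON d.pat_dim_sk = f.pat_dim_sk
    ORDER BY f.visit_date
""").fetchall()

print(current_rows)
print(historical_rows)
```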
