
Reassigning a dimension of a fact table

I am starting to build a star schema, and I like it ^^

I have a design problem with dimensional modeling.

I have a fact table with one row per transaction in the star schema (the most detailed grain). Something like this (simplified version):

transaction_facts
- id
- account_dim
- date_dim
- status_dim
- amount

status_dim
- id
- code
- description
- final

For a transaction, the status is not clearly defined at processing time. Almost all statuses fall into one of these cases:

  • the transaction is ok
  • the transaction is ko
  • the transaction is ok, but to be confirmed.

The last status is the problematic one, since I can receive the confirmation of the transaction a few days (up to 10, and sometimes even more) after the original transaction.

How should I handle this kind of late change? Intuitively, I would be tempted to just reassign the existing transactions to the new dimension value, but that raises two questions:

  • Is it good practice? (Do not rewrite history, etc.)
  • How do I handle this kind of change in BigQuery, Redshift, or any append-only system? With a very large number of rows it will be a problem, since these systems don't handle updates well.

If

  • this does not need to be a true "financial transaction" table AND
  • you don't need to keep a history of the values (e.g. what the value was as of some previous date)

then you can, and should, update the value.

If you are using Redshift, you can do this efficiently by loading a batch of updates into a staging table (COPY from S3) and then applying them all in one go with a single UPDATE.
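
To make the staging-table idea concrete, here is a minimal sketch of the pattern. It uses Python's built-in sqlite3 module purely as a local stand-in so the example can be run anywhere; on Redshift the staging table would be loaded with COPY from S3 and the update would more likely be written as UPDATE ... FROM with a join. The names status_updates_staging, make_demo_db and apply_status_updates, and the sample rows, are hypothetical and only there for illustration.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database mirroring the simplified schema from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE status_dim (
            id INTEGER PRIMARY KEY, code TEXT, description TEXT, final INTEGER);
        CREATE TABLE transaction_facts (
            id INTEGER PRIMARY KEY, account_dim INTEGER, date_dim INTEGER,
            status_dim INTEGER, amount REAL);
        -- Sample data (made up): status 3 is "ok, but to be confirmed".
        INSERT INTO status_dim VALUES
            (1, 'OK', 'ok', 1),
            (2, 'KO', 'ko', 1),
            (3, 'PENDING', 'ok, to be confirmed', 0);
        INSERT INTO transaction_facts VALUES
            (10, 1, 20240101, 3, 50.0),
            (11, 1, 20240101, 1, 20.0),
            (12, 2, 20240102, 3, 75.0);
    """)
    return conn


def apply_status_updates(conn, updates):
    """Apply a batch of (transaction_id, new_status_id) pairs with one UPDATE.

    Returns the number of fact rows that were modified.
    """
    cur = conn.cursor()
    # 1. Load the whole batch into a staging table
    #    (on Redshift this step would be a COPY from S3).
    cur.execute("""
        CREATE TEMP TABLE status_updates_staging (
            transaction_id INTEGER PRIMARY KEY,
            status_dim INTEGER NOT NULL)
    """)
    cur.executemany("INSERT INTO status_updates_staging VALUES (?, ?)", updates)
    # 2. Apply every staged change in a single set-based UPDATE,
    #    instead of issuing one UPDATE per transaction.
    cur.execute("""
        UPDATE transaction_facts
        SET status_dim = (
            SELECT s.status_dim
            FROM status_updates_staging AS s
            WHERE s.transaction_id = transaction_facts.id)
        WHERE id IN (SELECT transaction_id FROM status_updates_staging)
    """)
    changed = cur.rowcount
    conn.commit()
    # 3. Drop the staging table once the batch has been applied.
    cur.execute("DROP TABLE status_updates_staging")
    return changed


conn = make_demo_db()
# Late confirmations arrive: transaction 10 is confirmed ok, transaction 12 turns out ko.
changed = apply_status_updates(conn, [(10, 1), (12, 2)])
rows = conn.execute(
    "SELECT id, status_dim FROM transaction_facts ORDER BY id").fetchall()
print(changed, rows)
```

The point of the design is that the warehouse sees one bulk load and one set-based update per batch of late confirmations, which is much cheaper on a columnar system than many single-row updates.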

