简体   繁体   中英

How to handle deleted records (from source) in the Data Vault model?

We are building a Data Vault (2.0) model to capture SalesForce data. Like many other sources, the records in the source are soft deleted. While we are sourcing data to the Data Model, we do not want to filter any data & also capture deleted records in the target system. Searched for the best practice to handle deleted records in the Data Vault model, however no luck. Can someone please throw some light here? Should we add IsDeleted flag in Hub or Satellite considering the future expansion of the model & best design practices to follow. Also, any reference material links here would be of great help. Thank you.

In the DV2.0 specification, there's a "Record source tracking satellite" that does exactly what you want to do, keep tracks of inserted, updated and deleted records per source (p.143 if you have the book).

Basically, this is a satellite with a hashkey, loaddate, record source and a status (I/U/D). You insert a record each time a record is added, updated or deleted in a source system.

In DV1.0, there was an "last seen" field, but has been removed for performance issue ( https://danlinstedt.com/allposts/datavaultcat/end_of_updates/ )

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM