
Near-real-time replica from Azure VM SQL Server/DB into Azure Blob Storage (CSV/JSON)

I am trying to achieve a near-real-time replica (ideally within ~5 minutes) of data from a source system (SQL Server on an Azure VM; read-only access to about 100 tables) into an Azure Storage Account (Data Lake Gen2, blob folders) to support various upstream data workloads.

I had considered using Azure Data Factory to carry out an initial batch load of the historical data (this takes ~40 minutes in ADF), followed by incremental updates to the sink whenever the source tables change (updates or inserts).
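
For reference, the incremental step I had in mind follows the usual high-watermark pattern: copy only the rows changed since the previous run, then advance the watermark. A minimal T-SQL sketch is below; the table, column and watermark names (dbo.Orders, LastUpdatedOn, dbo.WatermarkTable) are placeholders, and the approach assumes a timestamp column exists, which, as the challenges below explain, is not true for every table.

    -- Read the watermark stored by the previous run (placeholder names)
    DECLARE @old_watermark DATETIME2 =
        (SELECT WatermarkValue FROM dbo.WatermarkTable WHERE TableName = 'dbo.Orders');
    DECLARE @new_watermark DATETIME2 = SYSUTCDATETIME();

    -- Pull only rows modified since the last successful copy
    SELECT *
    FROM dbo.Orders
    WHERE LastUpdatedOn > @old_watermark
      AND LastUpdatedOn <= @new_watermark;

    -- After the copy to the blob sink succeeds, persist the new watermark
    UPDATE dbo.WatermarkTable
    SET WatermarkValue = @new_watermark
    WHERE TableName = 'dbo.Orders';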

The challenges are:

  1. Some of the source tables are updated retroactively (e.g. a record dated two years ago is inserted today).
  2. Some of the source tables are not transactional tables (they are lookup tables without timestamp columns; a column such as "LastUpdatedOn" does not exist in these tables).

What are the best possible approaches to establish this synchronization between source and sink?

You could start with Change Data Capture (CDC) or Change Tracking on the source database, then run an SSIS job to write the changed data into blob storage. Or you could use something like Debezium.
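
To illustrate the Change Tracking route: it does not rely on a timestamp column, so it also covers the lookup tables mentioned in the question. The following is a minimal T-SQL sketch; the database, table and key names (SourceDb, dbo.LookupTable, LookupId) are placeholders, and the extraction job (SSIS, ADF or custom) is expected to persist the sync version between runs.

    -- Enable change tracking at the database level (placeholder database name)
    ALTER DATABASE SourceDb
    SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

    -- Enable change tracking on a lookup table that has no timestamp column
    ALTER TABLE dbo.LookupTable
    ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);

    -- Incremental pull: rows changed since the last sync version
    DECLARE @last_sync_version BIGINT = 0;  -- replace with the stored watermark

    SELECT ct.SYS_CHANGE_OPERATION,         -- I = insert, U = update, D = delete
           ct.SYS_CHANGE_VERSION,
           t.*
    FROM CHANGETABLE(CHANGES dbo.LookupTable, @last_sync_version) AS ct
    LEFT JOIN dbo.LookupTable AS t
           ON t.LookupId = ct.LookupId;     -- join on the table's primary key

    -- Store this value for the next run
    SELECT CHANGE_TRACKING_CURRENT_VERSION() AS current_version;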
