Near-real-time replica from Azure VM SQL Server into Azure Blob Storage (CSV/JSON)
I am trying to achieve a near-real-time replica (within ~5 minutes, ideally) of data from a source system (Azure VM running SQL Server; read-only, about 100 tables) into an Azure Storage Account (ADLS Gen2, blob folders) to support various upstream data workloads.
I had considered using Azure Data Factory to carry out an initial batch load of the historical data (which takes ~40 minutes with ADF), followed by incremental "updates" to the sink whenever source tables change (updates or inserts).
The challenges are:
What are the best possible approaches to establish this synchronization between source and sink?
You could start with Change Data Capture or Change Tracking, then run an SSIS job to write the data into blob storage. Or you could use something like Debezium.
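As a rough sketch of the Change Tracking route: you enable it once per database and per table, then each sync run asks SQL Server only for rows changed since the version you last synced. The database name, table, and key column below (`SourceDb`, `dbo.Orders`, `OrderId`) are placeholders; substitute your own.

```sql
-- Enable Change Tracking at the database level; the retention window
-- should comfortably exceed your polling interval (~5 minutes here).
ALTER DATABASE SourceDb
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- Enable it per table (repeat for each of the ~100 tables).
ALTER TABLE dbo.Orders
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = OFF);

-- On each sync run, fetch only rows changed since the last synced
-- version (the sync job persists @last_sync_version between runs).
DECLARE @last_sync_version bigint = 0;
SELECT ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION, o.*
FROM CHANGETABLE(CHANGES dbo.Orders, @last_sync_version) AS ct
LEFT JOIN dbo.Orders AS o
  ON o.OrderId = ct.OrderId;

-- Record the current version; it becomes @last_sync_version next run.
SELECT CHANGE_TRACKING_CURRENT_VERSION();
```

The SSIS (or ADF) job would run the `CHANGETABLE` query on a schedule, serialize the result set to CSV/JSON, and land it in the blob folder for that table; deleted rows show up with `SYS_CHANGE_OPERATION = 'D'` and a NULL right side of the join.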