
Copying data incrementally (delta data) from an API endpoint by HTTP GET to an Azure SQL DB on Azure Data Factory

I am trying to perform an ETL activity in which JSON data served by an API is incrementally copied into an Azure SQL Database table. The problem I'm having is that I am not sure how to account for new or changed entries. I don't want to delete everything and do a massive copy each time the pipeline runs. Are there any suggestions? The only help I've been able to find so far covers scenarios in which an Azure SQL Database table is the source instead of the sink.

[Requires OData or some other filtering capability on the API]

The common solution to this problem is to have one (or two) timestamp fields on each record:

UpdatedDate/CreatedDate

and then fetch only the rows where UpdatedDate/CreatedDate >= LastSuccessfulSyncDate
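As a rough sketch of that filter, assuming the API accepts an OData-style `$filter` query parameter and returns `UpdatedDate` as an ISO 8601 UTC string: the helper names `build_delta_url` and `next_watermark` and the example URL are hypothetical, not part of Azure Data Factory or of any particular API.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_delta_url(base_url, last_sync):
    """Build a GET URL that asks only for rows changed since last_sync.

    Assumes the endpoint understands an OData-style $filter parameter.
    """
    stamp = last_sync.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = urlencode({"$filter": f"UpdatedDate ge {stamp}"})
    return f"{base_url}?{query}"


def next_watermark(rows, previous):
    """Return the newest UpdatedDate seen, or the old watermark if no rows came back."""
    stamps = [
        datetime.fromisoformat(row["UpdatedDate"].replace("Z", "+00:00"))
        for row in rows
    ]
    return max(stamps, default=previous)


if __name__ == "__main__":
    last_sync = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Hypothetical endpoint, for illustration only.
    print(build_delta_url("https://api.example.com/items", last_sync))
```

Store the value returned by `next_watermark` only after the load into the sink has succeeded, so a failed run is retried from the old watermark. Because the comparison is `>=`, rows on the boundary can be fetched twice; that is harmless as long as the load into the table is keyed (an upsert rather than a blind insert).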

You need to modify the UpdatedDate/CreatedDate field every time a row is changed. To achieve that, use database triggers or set the field in your application logic.
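The trigger approach can be illustrated locally. This is only a sketch: it uses SQLite (from Python's standard library) as a stand-in for the real source database, and the `items` table, its columns, and the `items_touch` trigger are made-up names. Trigger syntax on other engines such as SQL Server differs.

```python
import sqlite3

# Timestamp expression in SQLite's dialect: ISO 8601 UTC with milliseconds.
NOW = "strftime('%Y-%m-%dT%H:%M:%fZ', 'now')"


def make_db():
    """Create an in-memory table whose UpdatedDate is maintained by a trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(f"""
        CREATE TABLE items (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            CreatedDate TEXT NOT NULL DEFAULT ({NOW}),
            UpdatedDate TEXT NOT NULL DEFAULT ({NOW})
        );

        -- Fires only when a data column changes, so the trigger's own
        -- write to UpdatedDate does not fire it again.
        CREATE TRIGGER items_touch AFTER UPDATE OF name ON items
        BEGIN
            UPDATE items SET UpdatedDate = {NOW} WHERE id = NEW.id;
        END;
    """)
    return conn


if __name__ == "__main__":
    conn = make_db()
    conn.execute("INSERT INTO items (id, name) VALUES (1, 'first')")
    conn.execute("UPDATE items SET name = 'renamed' WHERE id = 1")
    print(conn.execute("SELECT CreatedDate, UpdatedDate FROM items").fetchone())
```

With the timestamp maintained in the database, every writer gets it right without having to remember to set it, which is the main argument for a trigger over application logic.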

You could also consider having your application processes publish changes through a message broker such as RabbitMQ or Azure Service Bus, so that changes are pushed as they happen rather than polled for.
