MySQL Change Data Capture (CDC) - Azure Services (Azure Data Factory)

I want to perform an ETL operation on the tables of a MySQL database and store the data in an Azure data warehouse. I do not have an updated-date column to identify records modified over a given period. How can I find out which records have been modified? Does MySQL support CDC?

Is it possible to read the MySQL binlogs (binary logs) using Azure services (Azure Data Factory)?

If you can put together a single query that returns what you want, using whatever functions and joins are available to you, then you can put that query into the sqlReaderQuery part of the ADF pipeline.

Otherwise you might be able to use a stored procedure activity (sorry, I am not as familiar with MySQL as I am with ADF).

Do you have any column that is an increasing integer? If so, you can still use a lookup activity + copy activity + stored procedure activity to get an incremental load. More details are here: https://docs.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-powershell
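The lookup + copy + stored procedure pattern boils down to keeping a "watermark" (the highest key already copied) and only reading rows above it. Below is a minimal Python sketch of that logic, not ADF code. It uses in-memory SQLite databases as stand-ins for the MySQL source and the warehouse, and the table and column names (orders, orders_copy, watermark, watermark_value) are hypothetical.

```python
import sqlite3


def incremental_copy(source, sink, table="orders", key="id"):
    """Copy rows whose key is above the stored watermark, then advance it."""
    # Lookup step 1: read the watermark saved by the previous run (0 if none).
    row = sink.execute(
        "SELECT watermark_value FROM watermark WHERE table_name = ?", (table,)
    ).fetchone()
    old_mark = row[0] if row else 0

    # Lookup step 2: read the current highest key in the source table.
    new_mark = source.execute(f"SELECT MAX({key}) FROM {table}").fetchone()[0]
    if new_mark is None or new_mark <= old_mark:
        return 0  # nothing new since the last run

    # Copy step: move only the rows in the (old_mark, new_mark] window.
    rows = source.execute(
        f"SELECT id, amount FROM {table} WHERE {key} > ? AND {key} <= ?",
        (old_mark, new_mark),
    ).fetchall()
    sink.executemany("INSERT INTO orders_copy (id, amount) VALUES (?, ?)", rows)

    # Stored-procedure step: persist the new watermark for the next run.
    sink.execute(
        "INSERT OR REPLACE INTO watermark (table_name, watermark_value) "
        "VALUES (?, ?)",
        (table, new_mark),
    )
    sink.commit()
    return len(rows)


# Demo with in-memory stand-ins for the source and the warehouse.
source = sqlite3.connect(":memory:")
sink = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
sink.execute("CREATE TABLE orders_copy (id INTEGER PRIMARY KEY, amount REAL)")
sink.execute(
    "CREATE TABLE watermark (table_name TEXT PRIMARY KEY, watermark_value INTEGER)"
)

source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
first = incremental_copy(source, sink)   # initial load
source.execute("INSERT INTO orders VALUES (3, 30.0)")
second = incremental_copy(source, sink)  # picks up only the new row
third = incremental_copy(source, sink)   # no new rows
```

Note that an increasing integer key only reveals newly inserted rows. Rows that were updated in place keep their old key, so this approach alone will not pick them up.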

ADF does not have built-in support for CDC yet. You can do that through a custom activity in ADF with your own code.

In MySQL you have the option to add a timestamp column that is updated automatically, at row level, whenever the row is updated. CDC is not available, but when you want to see the difference you can compare the MAX(updatedate) on MySQL against (>=) your own MAX(ETLDate) to get all the modified records.
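As a rough sketch of this approach, the Python snippet below holds a MySQL DDL statement that adds an auto-updating timestamp column, plus a helper that builds the query for modified rows. The table name orders is hypothetical, updatedate follows the naming used above, and the %s placeholder assumes a MySQL driver that uses that parameter style (for example PyMySQL or mysql-connector-python).

```python
# MySQL DDL, run once against the source table (table name is hypothetical).
# ON UPDATE CURRENT_TIMESTAMP makes MySQL refresh the column on every row update.
ADD_TIMESTAMP_DDL = """
ALTER TABLE orders
  ADD COLUMN updatedate TIMESTAMP NOT NULL
  DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
"""


def changed_rows_query(table, column="updatedate"):
    """Build a parameterized query for rows modified at or after the last ETL run.

    The single parameter is your own MAX(ETLDate), i.e. the timestamp
    recorded by the previous load.
    """
    return f"SELECT * FROM {table} WHERE {column} >= %s"


query = changed_rows_query("orders")
```

Because the comparison uses >=, rows stamped exactly at the previous MAX(ETLDate) can be read twice, so the load into the warehouse should tolerate duplicates (for example by upserting on the primary key).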
