
Archiving a few tables from a MySQL database to Parquet files using Azure Data Factory

I am new to ADF and looking for suggestions on how to handle the following situation.

We have a requirement to archive data from a few tables in a MySQL database to Parquet files and, after archival, purge those entries from the MySQL tables. The entries to be archived and purged are driven by one main table in the database: the remaining tables must be joined to this main table's primary key and filtered by the main table's creation timestamp. Each table must produce its own Parquet file in the storage account.
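To make the relationship concrete, the per-table extraction I have in mind looks roughly like this (`main_table`, `child_table`, and the column names are placeholders, not our real schema):

```sql
-- Placeholder schema: main_table(id PK, created_at), child_table(main_id FK).
-- One query of this shape per dependent table, each feeding its own Parquet file.
SELECT c.*
FROM child_table c
INNER JOIN main_table m
        ON c.main_id = m.id
WHERE m.created_at < DATE_SUB(NOW(), INTERVAL 6 MONTH);
```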

  1. How do I identify and transfer the records, across all tables, that are related to entries in the main table older than 6 months?
  2. How do I handle this transactionally? I do not want some tables to finish transferring data for a main-table entry while others fail (see the purge sketch after this list).
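On the purge side, what I would like to achieve is something like the following sketch, again with placeholder names, where all deletes for a given batch of main-table entries either commit together or roll back together:

```sql
-- Rough sketch of the intended purge, same placeholder schema as above.
-- Children are deleted before the parent rows so foreign keys are not violated;
-- the whole purge is one unit of work.
START TRANSACTION;

DELETE c
FROM child_table c
INNER JOIN main_table m
        ON c.main_id = m.id
WHERE m.created_at < DATE_SUB(NOW(), INTERVAL 6 MONTH);

DELETE FROM main_table
WHERE created_at < DATE_SUB(NOW(), INTERVAL 6 MONTH);

COMMIT;
```

The open question is how to coordinate this with the ADF copy step, so that the purge only runs once every table's Parquet file has been written successfully.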

I tried creating a Copy activity, but didn't know how to join all the tables in ADF. If I use a stored procedure to join all the dependent tables, the result set is huge. I also didn't know how to split that result set into one Parquet file per table.
