
Azure Data Factory: Copy specific files from multiple parent folders on an FTP server

I am trying to copy .zip files from an FTP server to Azure Data Lake. I need to copy specific files from specific parent folders (there are 6 parent folders on the FTP server in total), and the pipeline needs to be scheduled. How should I provide the parameters so that the pipeline selects only the specific files from the different folders?

I have used the Get Metadata activity and tried creating pipelines, but I am not sure how to make the pipeline pick only specific files.

Azure Data Factory supports compressing and decompressing data during copy. When you specify the compression property in an input dataset, the copy activity reads the compressed data from the source and decompresses it; when you specify the property in an output dataset, the copy activity compresses the data and then writes it to the sink.

For example:

Read a .zip file from the FTP server, decompress it to get the files inside, and land those files in Azure Data Lake Store. To do this, define an input FTP dataset with the compression type property set to ZipDeflate.

For more details, see: Compression support.

Here is the tutorial: Copy data from FTP server by using Azure Data Factory.

To copy data from FTP in ORC/Avro/JSON/Binary format, see the supported properties at this link: Other format dataset.


Tips:

  1. To copy all files under a folder, specify folderPath only.
  2. To copy a single file with a given name, specify folderPath with folder part and fileName with file name.
  3. To copy a subset of files under a folder, specify folderPath with folder part and fileName with wildcard filter.
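The three cases in the tips can be illustrated outside Data Factory. The sketch below is plain Python, not Data Factory code: `select_files` is a hypothetical helper, the file names are made up, and `fnmatch` only approximates the wildcard filter (it also accepts `[...]` character sets, which the tips do not mention).

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def select_files(files_in_folder: List[str], file_name: Optional[str] = None) -> List[str]:
    """Return the files a copy would pick up from one folder.

    file_name=None      -> all files under the folder (tip 1)
    file_name="a.zip"   -> a single named file (tip 2)
    file_name="*.zip"   -> a wildcard-filtered subset (tip 3)
    """
    if file_name is None:
        return list(files_in_folder)
    # A literal name is simply a pattern without wildcard characters.
    return [f for f in files_in_folder if fnmatchcase(f, file_name)]


# Hypothetical folder listing, for illustration only.
listing = ["sales_2020.zip", "sales_2021.zip", "readme.txt"]

all_files = select_files(listing)                  # tip 1: folder only
single_file = select_files(listing, "readme.txt")  # tip 2: exact file name
zip_files = select_files(listing, "sales_*.zip")   # tip 3: wildcard filter
```

In the dataset itself, the same choice is made by leaving fileName empty, setting it to an exact name, or setting it to a wildcard pattern.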

Hope this helps.

You'll need to use the Filter activity to keep only the folders and files that you need. I think you'll need two loops:

Loop 1: get metadata of folders -> filter required folders -> ForEach pipeline with loop 2

Loop 2: get metadata of files -> filter required files -> copy required files
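The control flow of the two loops can be sketched in plain Python to make the nesting explicit. This is only a simulation of the selection logic, not Data Factory code: `FTP_TREE`, `plan_copies`, the folder names and the file-name rule are all hypothetical stand-ins for the metadata output and the filter conditions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical FTP layout: parent folder -> file names
# (a stand-in for what the metadata step would return).
FTP_TREE: Dict[str, List[str]] = {
    "FolderA": ["report_a.zip", "notes.txt"],
    "FolderB": ["report_b.zip", "archive_old.zip"],
    "FolderC": ["report_c.zip"],
}


def plan_copies(
    tree: Dict[str, List[str]],
    folder_wanted: Callable[[str], bool],
    file_wanted: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Return the (folder, file) pairs that the copy step would receive."""
    copies = []
    # Loop 1: list folders and keep only the required ones.
    for folder in sorted(tree):
        if not folder_wanted(folder):
            continue
        # Loop 2: list files in that folder and keep only the required ones.
        for name in tree[folder]:
            if file_wanted(name):
                copies.append((folder, name))
    return copies


required_folders = {"FolderA", "FolderB"}  # assumed selection
planned = plan_copies(
    FTP_TREE,
    folder_wanted=lambda f: f in required_folders,
    file_wanted=lambda n: n.startswith("report_") and n.endswith(".zip"),
)
```

Here `planned` contains one entry per file to copy; in the pipeline, each such entry would correspond to one run of the copy step inside the inner loop.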
