Access S3 files from Azure Synapse Notebook

Goal: Move a large number of files from AWS S3 to ADLS Gen2 as fast as possible from a Synapse Notebook in Azure Synapse, selecting files with a parameterized regular expression for the filename pattern.

What I tried so far:

  1. To access ADLS Gen2, mssparkutils.fs.ls('abfss://container_name@storage_account_name.blob.core.windows.net/foldername') works, but what is the equivalent for accessing S3?
  2. I used mssparkutils.credentials.getSecret('AKV name','secretname') and mssparkutils.credentials.getSecret('AKV name','secret key id') to fetch the secrets in the Synapse notebook, but I was unable to configure S3 access from Synapse.

Question: Do I have to use the existing linked service via the credentials.getFullConnectionString(LinkedService) API? In short, how do I configure connectivity to S3 from within a Synapse Notebook?

Answering my own question here: AzCopy worked. The links below helped me finish the task. The steps are as follows.

  1. Install AzCopy on your machine.
  2. Open a terminal, go to the directory where the executable is installed, and run "azcopy login". Open the link from the terminal message in your browser, enter the code provided in the terminal, and sign in with your Azure Active Directory credentials.
  3. Authorize with S3 by setting the environment variables: set AWS_ACCESS_KEY_ID= and set AWS_SECRET_ACCESS_KEY= (each followed by your own value).
  4. For ADLS Gen2, authorization was already done in step 2.
  5. Use whichever commands suit your needs from the links below.

https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10

https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-s3
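
For anyone who wants to script steps 3 to 5 instead of typing them, here is a minimal sketch in Python. It assumes azcopy is on the PATH and that "azcopy login" has already been done. The function names, bucket name, and destination URL are hypothetical placeholders, not something from the linked docs. Note that --include-pattern takes wildcards such as *.csv rather than a full regular expression, so a regex filter would need a different mechanism.

```python
import os
import subprocess


def build_azcopy_s3_command(bucket, destination_url, prefix="", include_pattern=None):
    """Build the argument list for an `azcopy copy` from an S3 bucket.

    All argument values are placeholders supplied by the caller.
    """
    source = f"https://s3.amazonaws.com/{bucket}"
    if prefix:
        # Normalize the optional folder prefix so it joins cleanly.
        source += "/" + prefix.strip("/")
    cmd = ["azcopy", "copy", source, destination_url, "--recursive=true"]
    if include_pattern:
        # Wildcard filter on file names (for example "*.csv"), not a regex.
        cmd.append(f"--include-pattern={include_pattern}")
    return cmd


def run_azcopy_s3_copy(cmd, access_key, secret_key):
    """Run the command with the S3 credentials set only for the child process."""
    env = dict(os.environ)
    env["AWS_ACCESS_KEY_ID"] = access_key
    env["AWS_SECRET_ACCESS_KEY"] = secret_key
    # check=True raises CalledProcessError if azcopy exits with an error.
    return subprocess.run(cmd, env=env, check=True)


# Example: build (but do not run) a command for a hypothetical bucket and container.
example_cmd = build_azcopy_s3_command(
    "my-bucket",
    "https://mystorageaccount.blob.core.windows.net/mycontainer",
    prefix="incoming",
    include_pattern="*.csv",
)
print(" ".join(example_cmd))
```

Passing the keys through env keeps them out of the command line itself. To actually start the transfer, call run_azcopy_s3_copy(example_cmd, access_key, secret_key) with your own values.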
