
Iterate the files in an ADLS2 Azure Datalake Directory given a SAS url

I'd like to download the files from an ADLS2 Storage blob directory. I have only a SAS URL to that directory, and I would like to recursively download all the files in it.

It is clear how to do this given the storage credentials, and there are many examples that show how, but I couldn't find any that use a SAS URL.

Any clues or documentation links would be much appreciated!

I reproduced this in my environment and got the expected results, shown below. The code is taken from @ROGER ZANDER's blog:

function DownloadBlob {
    param (
        [Parameter(Mandatory)]
        [string]$URL,
        [string]$Path = (Get-Location)
    )

    # Split the SAS URL into the resource URI and the SAS token.
    $uri = $URL.Split('?')[0]
    $sas = $URL.Split('?')[1]

    # List the blobs through the REST API; the response is XML.
    $newurl = $uri + "?restype=container&comp=list&" + $sas
    $body = Invoke-RestMethod -Uri $newurl
    # Drop anything before the first '<' (such as a byte order mark) before parsing.
    $xml = [xml]$body.Substring($body.IndexOf('<'))
    $files = $xml.ChildNodes.Blobs.Blob.Name

    # Print each blob name, create its local folder, then download it.
    $files | ForEach-Object {
        $_
        New-Item (Join-Path $Path (Split-Path $_)) -ItemType Directory -ea SilentlyContinue | Out-Null
        (New-Object System.Net.WebClient).DownloadFile($uri + "/" + $_ + "?" + $sas, (Join-Path $Path $_))
    }
}

Then call the DownloadBlob function and pass it the SAS URL.
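For readers who want to see the URL handling on its own, here is a minimal Python sketch of the same three steps the PowerShell function performs: building the list request from the SAS URL, building a per-blob download URL, and pulling the blob names out of the XML listing. It uses only the standard library and makes no network calls. The function names and the example account and container names are assumptions for illustration, not part of any Azure SDK.

```python
from urllib.parse import urlsplit, urlunsplit, quote
import xml.etree.ElementTree as ET


def build_list_url(sas_url: str) -> str:
    """Turn a container SAS URL into a List Blobs request URL."""
    parts = urlsplit(sas_url)
    # Prepend the listing parameters and keep the SAS token untouched.
    query = "restype=container&comp=list&" + parts.query
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def build_blob_url(sas_url: str, blob_name: str) -> str:
    """Append a blob name to the container path and keep the SAS token."""
    parts = urlsplit(sas_url)
    # quote() leaves '/' alone, so virtual folders in the name survive.
    path = parts.path.rstrip("/") + "/" + quote(blob_name)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def parse_blob_names(list_response: str) -> list[str]:
    """Extract every <Name> under <Blobs><Blob> from a List Blobs response."""
    # Skip anything before the first '<' (for example a byte order mark).
    root = ET.fromstring(list_response[list_response.index("<"):])
    return [blob.findtext("Name") for blob in root.iter("Blob")]
```

Note that a single List Blobs response is capped at 5000 entries; for larger containers the response carries a NextMarker value that has to be sent back as the marker parameter to fetch the next page, which neither this sketch nor the PowerShell function handles.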

Output:

[image]

Downloaded files on the local machine:

[image]

Use: https://learn.microsoft.com/en-us/dotnet/api/azure.storage.files.datalake.datalakefileclient?view=azure-dotnet

I don't know whether there is a method for downloading a whole directory from blob storage, but you can create a download folder and download all the files in the directory in a loop. It takes a few steps:

Create a service client with DataLakeServiceClient(Uri, AzureSasCredential) to access the data lake using the SAS.

Then get a DataLakeFileSystemClient to access the container:

DataLakeFileSystemClient fileSystem = client.GetFileSystemClient(_containerName);

Use DataLakeDirectoryClient directoryClient = fileSystem.GetDirectoryClient(directoryName); to get the directory.

To loop through the items in the directory use the loop below:

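// GetPaths(recursive: true) also returns subdirectories, so skip those entries.
foreach (PathItem pathItem in directoryClient.GetPaths(recursive: true))
{
    if (pathItem.IsDirectory == true) continue;

    // pathItem.Name is the full path from the file system root.
    DataLakeFileClient fileClient = fileSystem.GetFileClient(pathItem.Name);

    // Create the local folder before writing the file into it.
    string localPath = Path.Combine(downloadpath, pathItem.Name);
    Directory.CreateDirectory(Path.GetDirectoryName(localPath));

    await fileClient.ReadToAsync(localPath);
}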
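The bookkeeping around such a loop (ignoring directory entries, mapping each remote path to a local one, and creating the local folders first) can be sketched independently of the SDK. The Python below is only an illustration under assumptions: `plan_downloads` and `prepare_targets` are hypothetical helpers, and the `(name, is_directory)` pairs stand in for the PathItem objects a recursive listing would return.

```python
from pathlib import Path, PurePosixPath


def plan_downloads(paths, download_root):
    """Map remote file paths to local targets, skipping directory entries.

    `paths` is an iterable of (name, is_directory) pairs, a hypothetical
    stand-in for the items returned by a recursive directory listing.
    """
    root = Path(download_root)
    plan = []
    for name, is_directory in paths:
        if is_directory:
            continue  # folders are created later from the file paths
        # Remote names use '/' regardless of the local operating system.
        local = root.joinpath(*PurePosixPath(name).parts)
        plan.append((name, local))
    return plan


def prepare_targets(plan):
    """Create the parent folder of every planned local file."""
    for _, local in plan:
        local.parent.mkdir(parents=True, exist_ok=True)
```

With the plan in hand, the actual download step only has to fetch each remote name and write it to the paired local path.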
