
Iterate the files in an ADLS2 Azure Datalake Directory given a SAS url

I'd like to download the files from an ADLS2 Storage blob directory. I only have a SAS url to that directory, and I would like to recursively download all the files in it.

It is very clear how to do this given the storage credentials, and there are many examples showing how — but I couldn't find any that uses a SAS url.

Any clues or documentation links would be much appreciated!

I have reproduced this in my environment and got the expected results below. I have taken the code from @ROGER ZANDER's blog:

function DownloadBlob {
    param (
        [Parameter(Mandatory)]
        [string]$URL,
        [string]$Path = (Get-Location)
    )

    # Split the SAS URL into the container URI and the SAS token
    $uri = $URL.Split('?')[0]
    $sas = $URL.Split('?')[1]

    # List the blobs in the container via the List Blobs REST API
    $newurl = $uri + "?restype=container&comp=list&" + $sas
    $body = Invoke-RestMethod -Uri $newurl

    # Strip any leading bytes before the XML declaration, then parse the listing
    $xml = [xml]$body.Substring($body.IndexOf('<'))
    $files = $xml.ChildNodes.Blobs.Blob.Name

    foreach ($file in $files) {
        $file
        # Recreate the blob's folder structure locally
        New-Item (Join-Path $Path (Split-Path $file)) -ItemType Directory -ErrorAction SilentlyContinue | Out-Null
        # Download each blob using the same SAS token
        (New-Object System.Net.WebClient).DownloadFile($uri + "/" + $file + "?" + $sas, (Join-Path $Path $file))
    }
}

Then call the DownloadBlob function and give it the SAS URL.
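For example, a call might look like the following. The account name, container, SAS token, and local path here are placeholders — substitute your own values:

```powershell
# Placeholder SAS URL - replace with the real container URL and token
$sasUrl = "https://mystorageaccount.blob.core.windows.net/mycontainer?sv=2021-06-08&sp=rl&sig=..."

# Download everything in the container into C:\Downloads\adls-files,
# preserving the blob folder structure
DownloadBlob -URL $sasUrl -Path "C:\Downloads\adls-files"
```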

Output:

(screenshot: blob names listed in the console)

Downloaded files on the local machine:

(screenshot: files downloaded to the local folder)

Use: https://learn.microsoft.com/en-us/dotnet/api/azure.storage.files.datalake.datalakefileclient?view=azure-dotnet

I don't know whether there is a method for downloading a whole directory from blob storage, but you can create a download folder and download all the files in the directory in a loop. It takes a few steps:

Create a service client using DataLakeServiceClient to get access to the Data Lake with the SAS; use the DataLakeFileClient(Uri, AzureSasCredential) constructor to create a client.

Then, to access the container, use DataLakeFileSystemClient:

fileSystem = CreateFileSystem(client, _containerName)

Use DataLakeDirectoryClient directoryClient = fileSystem.GetDirectoryClient(directoryName); to get the directory.

To loop through the items in the directory, use the loop below:

foreach (PathItem pathItem in directoryClient.GetPaths())
{
    // Take only the file name after the last '/'
    int pos = pathItem.Name.LastIndexOf("/") + 1;
    DataLakeFileClient fileClient = directoryClient.GetFileClient(pathItem.Name.Substring(pos));

    // Stream the file contents to the local download path
    await fileClient.ReadToAsync(downloadpath + @"\" + pathItem.Name);
}
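Putting the steps above together, a minimal sketch might look like the following. The account, container, directory, SAS token, and download path are placeholders, not values from the question:

```csharp
using System;
using System.IO;
using Azure;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

// Placeholder endpoint and SAS token - substitute your own.
// The SAS token is passed without the leading '?'.
var serviceUri = new Uri("https://myaccount.dfs.core.windows.net");
var credential = new AzureSasCredential("sv=2021-06-08&sp=rl&sig=...");

// Service client -> file system (container) client -> directory client
var serviceClient   = new DataLakeServiceClient(serviceUri, credential);
var fileSystem      = serviceClient.GetFileSystemClient("mycontainer");
var directoryClient = fileSystem.GetDirectoryClient("mydirectory");

string downloadPath = @"C:\Downloads";

// recursive: true also descends into subdirectories
foreach (PathItem pathItem in directoryClient.GetPaths(recursive: true))
{
    if (pathItem.IsDirectory == true)
        continue; // skip subdirectory entries, download only files

    DataLakeFileClient fileClient = fileSystem.GetFileClient(pathItem.Name);

    // Recreate the remote folder structure under the download path
    string localFile = Path.Combine(downloadPath, pathItem.Name.Replace('/', Path.DirectorySeparatorChar));
    Directory.CreateDirectory(Path.GetDirectoryName(localFile)!);

    await fileClient.ReadToAsync(localFile);
}
```

GetPaths(recursive: true) is used here so nested folders are handled in one pass; if you only need the top level, the plain GetPaths() call from the loop above is enough.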

