Download all blobs within an Azure Storage container
I've managed to write a Python script that lists all the blobs in a container.
import azure
from azure.storage.blob import BlobService
from azure.storage import *

blob_service = BlobService(account_name='<ACCOUNT_NAME>', account_key='<ACCOUNT_KEY>')

# Collect every blob, following the continuation marker from page to page
blobs = []
marker = None
while True:
    batch = blob_service.list_blobs('<CONTAINER>', marker=marker)
    blobs.extend(batch)
    if not batch.next_marker:
        break
    marker = batch.next_marker

for blob in blobs:
    print(blob.name)
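As I said, this only lists the blobs that I want to download. I've moved on to the Azure CLI to see whether it can help with what I want to do. I can download a single blob with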
azure storage blob download [container]
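It then prompts me to specify a blob, which I can get from the Python script. The only way I have of downloading all those blobs is to copy and paste each name into the prompt after running the command above. Is there a way I can:
A. Write a bash script that iterates through the list of blobs by running the command and then pasting the next blob name into the prompt.
B. Specify downloading a whole container, either in the Python script or in the Azure CLI. Is there something I'm not seeing for downloading the entire container?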
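@gary-liu-msft's solution is correct. I made some more changes to it, and the code can now iterate through the container and the folder structure in it (PS: there are no folders in containers, only paths), check whether the same directory structure exists on the client, create it if it doesn't, and download the blobs in those paths. It supports long paths with embedded subdirectories.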
from azure.storage.blob import BlockBlobService
from azure.storage.blob import PublicAccess
import os

#name of your storage account and the access key from Settings->AccessKeys->key1
block_blob_service = BlockBlobService(account_name='storageaccountname', account_key='accountkey')

#name of the container
generator = block_blob_service.list_blobs('testcontainer')

#code below lists all the blobs in the container and downloads them one after another
for blob in generator:
    print(blob.name)
    print("{}".format(blob.name))
    #check if the path contains a folder structure, create the folder structure
    if "/" in "{}".format(blob.name):
        print("there is a path in this")
        #extract the folder path and check if that folder exists locally, and if not create it
        head, tail = os.path.split("{}".format(blob.name))
        print(head)
        print(tail)
        if (os.path.isdir(os.getcwd()+ "/" + head)):
            #download the files to this directory
            print("directory and sub directories exist")
            block_blob_service.get_blob_to_path('testcontainer',blob.name,os.getcwd()+ "/" + head + "/" + tail)
        else:
            #create the directory and download the file to it
            print("directory doesn't exist, creating it now")
            os.makedirs(os.getcwd()+ "/" + head, exist_ok=True)
            print("directory created, download initiated")
            block_blob_service.get_blob_to_path('testcontainer',blob.name,os.getcwd()+ "/" + head + "/" + tail)
    else:
        #no path in the blob name, download it to the current directory
        block_blob_service.get_blob_to_path('testcontainer',blob.name,blob.name)
The same code is also available here: https://gist.github.com/brijrajsingh/35cd591c2ca90916b27742d52a3cf6ba
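The path handling in that answer can probably be condensed into one small helper. This is only a sketch: `local_path_for` and its `root` parameter are names made up for illustration, not part of any Azure SDK. It relies on `os.makedirs(..., exist_ok=True)`, which makes the separate `os.path.isdir` check unnecessary.

```python
import os


def local_path_for(root, blob_name):
    """Map a blob name such as 'a/b/c.txt' to a path under root.

    Hypothetical helper: creates any missing parent directories and
    returns the local file path the blob could be written to.
    """
    # Blob names use "/" as the separator regardless of the local OS.
    parts = [p for p in blob_name.split("/") if p]
    target = os.path.join(root, *parts)
    parent = os.path.dirname(target)
    if parent:
        # exist_ok=True means no prior isdir() check is needed.
        os.makedirs(parent, exist_ok=True)
    return target
```

With a helper like this, the loop body would shrink to a single `get_blob_to_path('testcontainer', blob.name, local_path_for(os.getcwd(), blob.name))` call, whether or not the blob name contains a path.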
Since @brij-raj-singh-msft's answer, Microsoft has released the Gen2 version of the Azure Storage Blob client library for Python. The code below was tested with version 12.5.0, on 25 September 2020.
import os
from azure.storage.blob import BlobServiceClient, ContainerClient, BlobClient
import datetime

# Assuming your Azure connection string environment variable is set.
# If not, create BlobServiceClient using url & credentials.
# Example: https://docs.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.blobserviceclient

# Defined before the loop below so that it exists when it is first called
def download_blob(blob_client, destination_file):
    print("[{}]:[INFO] : Downloading {} ...".format(datetime.datetime.utcnow(), destination_file))
    with open(destination_file, "wb") as my_blob:
        blob_data = blob_client.download_blob()
        blob_data.readinto(my_blob)
    print("[{}]:[INFO] : download finished".format(datetime.datetime.utcnow()))

connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
blob_service_client = BlobServiceClient.from_connection_string(conn_str=connection_string)

# create container client
container_name = 'test2'
container_client = blob_service_client.get_container_client(container_name)

# Check whether a top-level local folder exists for the container.
# If not, create one
data_dir = 'Z:/azure_storage'
data_dir = data_dir + "/" + container_name
if not (os.path.isdir(data_dir)):
    print("[{}]:[INFO] : Creating local directory for container".format(datetime.datetime.utcnow()))
    os.makedirs(data_dir, exist_ok=True)

# code below lists all the blobs in the container and downloads them one after another
blob_list = container_client.list_blobs()
for blob in blob_list:
    print("[{}]:[INFO] : Blob name: {}".format(datetime.datetime.utcnow(), blob.name))
    # check if the path contains a folder structure, create the folder structure
    if "/" in "{}".format(blob.name):
        # extract the folder path and check if that folder exists locally, and if not create it
        head, tail = os.path.split("{}".format(blob.name))
        if not (os.path.isdir(data_dir + "/" + head)):
            # create the directory and download the file to it
            print("[{}]:[INFO] : {} directory doesn't exist, creating it now".format(datetime.datetime.utcnow(), data_dir + "/" + head))
            os.makedirs(data_dir + "/" + head, exist_ok=True)
    # Finally, download the blob
    blob_client = container_client.get_blob_client(blob.name)
    download_blob(blob_client, data_dir + "/" + blob.name)
The same code is also available here: https://gist.github.com/allene/6bbb36ec3ed08b419206156567290b13
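If only part of a container is needed, the same v12 loop could be wrapped in a function that takes an optional name prefix. The sketch below is an assumption-laden variant, not the answer author's code: `download_container`, `dest_root` and `prefix` are illustrative names, and the function only expects an object that offers `list_blobs(name_starts_with=...)` and `download_blob(name)` the way a v12 `ContainerClient` does.

```python
from pathlib import Path


def download_container(container_client, dest_root, prefix=None):
    """Download every blob, or only those whose name starts with prefix.

    Hypothetical helper; container_client is assumed to behave like a
    v12 ContainerClient. Returns the list of local paths written.
    """
    dest_root = Path(dest_root)
    written = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        # Blob names use "/" separators; rebuild the path for the local OS.
        target = dest_root.joinpath(*blob.name.split("/"))
        target.parent.mkdir(parents=True, exist_ok=True)
        with open(target, "wb") as fh:
            # download_blob returns a downloader that can stream into a file.
            container_client.download_blob(blob.name).readinto(fh)
        written.append(target)
    return written


# Possible usage (needs azure-storage-blob v12 and a real connection string):
# from azure.storage.blob import BlobServiceClient
# service = BlobServiceClient.from_connection_string("<CONNECTION_STRING>")
# download_container(service.get_container_client("test2"), "Z:/azure_storage/test2")
```

Passing the client in, rather than creating it inside the function, keeps the loop easy to try out against a stand-in object before pointing it at a real storage account.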
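Currently, it seems we cannot download all the blobs in a container directly with a single API call. All the available blob operations are listed at https://msdn.microsoft.com/en-us/library/azure/dd179377.aspx.
So we can list the blobs, which gives us a ListGenerator, and then download them in a loop. E.g.: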
result = blob_service.list_blobs(container)
for b in result.items:
    r = blob_service.get_blob_to_path(container, b.name, "folder/{}".format(b.name))
When using azure-storage-python, import the block blob service:
from azure.storage.blob import BlockBlobService
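For large containers, the list-then-download loop may need to follow continuation markers, as the script in the question does. The following is a rough sketch that combines the two; `download_all` and `dest_dir` are illustrative names, and the function only assumes a service object exposing `list_blobs(container, marker=...)` and `get_blob_to_path(container, blob_name, file_path)` as used elsewhere in this thread.

```python
import os


def download_all(blob_service, container, dest_dir):
    """Page through a container with markers and download each blob.

    Hypothetical helper; returns the blob names in download order.
    """
    names = []
    marker = None
    while True:
        batch = blob_service.list_blobs(container, marker=marker)
        for blob in batch:
            # Turn "a/b/c.txt" into a local path below dest_dir.
            target = os.path.join(dest_dir, *blob.name.split("/"))
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            blob_service.get_blob_to_path(container, blob.name, target)
            names.append(blob.name)
        # A missing or empty next_marker means there are no more pages.
        marker = getattr(batch, "next_marker", None)
        if not marker:
            break
    return names
```

Creating the parent directory first matters because blob names containing "/" would otherwise point at local folders that do not exist yet.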
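I made a Python wrapper for the Azure CLI that lets us download and upload in batches. This way we can download a complete container or just certain files from a container.
Install: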
pip install azurebatchload
import os
from azurebatchload.download import DownloadBatch
az_batch = DownloadBatch(
    destination='../pdfs',
    source='blobcontainername',
    pattern='*.pdf'
)
az_batch.download()
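To get a feel for which blobs a pattern such as `'*.pdf'` would select, the names can be filtered locally first. This sketch assumes glob-style semantics like those of Python's `fnmatch` module; the wrapper's own matching may differ, and `matching_blobs` is just an illustrative name.

```python
from fnmatch import fnmatchcase


def matching_blobs(blob_names, pattern):
    """Return the names that match a glob-style pattern (case-sensitive)."""
    return [name for name in blob_names if fnmatchcase(name, pattern)]


names = ["report.pdf", "2020/invoice.pdf", "notes.txt"]
print(matching_blobs(names, "*.pdf"))
# → ['report.pdf', '2020/invoice.pdf']
```

Note that in `fnmatch` the `*` wildcard also matches "/", so blobs under a path prefix are included.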
Here is a simple script (PowerShell) that iterates through a single container and downloads everything in it to the $Destination you provide.
#
# Download All Blobs in a Container
#
# Connect to Azure Account
Connect-AzAccount

# Set Variables
$Destination = "C:\Software"
$ResourceGroupName = 'resource group name'
$ContainerName = 'container name'
$storageAccName = 'storage account name'

# Function to download all blob contents
Function DownloadBlobContents
{
    Write-Host -ForegroundColor Green "Download blob contents from storage container.."
    # Get the storage account
    $StorageAcc = Get-AzStorageAccount -ResourceGroupName $ResourceGroupName -Name $storageAccName
    # Get the storage account context
    $Ctx = $StorageAcc.Context
    # Get all containers
    $Containers = Get-AzStorageContainer -Context $Ctx
    Write-Host -ForegroundColor Magenta $ContainerName "Creating or checking folder presence"
    # Create or check folder presence
    New-Item -ItemType Directory -Path $Destination -Force
    # Get the blob contents from the container
    $BlobContents = Get-AzStorageBlob -Container $ContainerName -Context $Ctx
    # Loop through each blob and download each one until they are all complete
    foreach ($BlobContent in $BlobContents)
    {
        # Download the blob content
        Get-AzStorageBlobContent -Container $ContainerName -Context $Ctx -Blob $BlobContent.Name -Destination $Destination -Force
        Write-Host -ForegroundColor Green "Downloaded a blob content"
    }
}

DownloadBlobContents
What I would like to learn is how to skip Connect-AzAccount and use the storage account's key instead, so that I can run it without ...