Download a specific blob (Azure) from multiple containers with Python
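I'm just looking for some help. I'm new to Python, but I'm trying to get something working. I need to download a specific blob (actually an .xlsx file) from multiple containers. The process creates one container every day, and I want to download one file from each of those containers. I tried the following: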
    # download_blobs.py
    # Python program to bulk download blob files from azure storage
    # Uses latest python SDK() for Azure blob storage
    # Requires python 3.6 or above
    import os
    from azure.storage.blob import BlobServiceClient, BlobClient
    from azure.storage.blob import ContentSettings, ContainerClient

    # IMPORTANT: Replace connection string with your storage account connection string
    # Usually starts with DefaultEndpointsProtocol=https;...
    MY_CONNECTION_STRING = "my_connection_string"
    # Replace with blob container
    MY_BLOB_CONTAINER = "^092022"
    # Replace with the local folder where you want files to be downloaded
    LOCAL_BLOB_PATH = "a_local_path"
    # Replace with the blob to download
    BLOB_NAME = "^xlsx'"

    class AzureBlobFileDownloader:
        def __init__(self):
            print("Initializing AzureBlobFileDownloader")
            # Initialize the connection to Azure storage account
            self.blob_service_client = BlobServiceClient.from_connection_string(MY_CONNECTION_STRING)
            self.my_container = self.blob_service_client.get_container_client(MY_BLOB_CONTAINER)

        def save_blob(self, file_name, file_content):
            # Get full path to the file
            download_file_path = os.path.join(LOCAL_BLOB_PATH, file_name)
            # for nested blobs, create local path as well!
            os.makedirs(os.path.dirname(download_file_path), exist_ok=True)
            with open(download_file_path, "wb") as file:
                file.write(file_content)

        def download_all_blobs_in_container(self):
            my_blobs = self.my_container.list_blobs(BLOB_NAME)
            for blob in my_blobs:
                print(blob.name)
                bytes = self.my_container.get_blob_client(blob).download_blob().readall()
                self.save_blob(blob.name, bytes)

    # Initialize class and download files
    azure_blob_file_downloader = AzureBlobFileDownloader()
    azure_blob_file_downloader.download_all_blobs_in_container()
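The containers created each day are named as follows:
01092022 - 02092022 - 03092022 - ...
The blobs I want to download are:
p.01092022.xlsx - p.02092022.xlsx - p.03092022.xlsx - ...
How can I make Python go through each container and download the file that belongs to it, given that the names follow this sequence?
Thanks for your help!
Great.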
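Does it create only 1 container with 1 file each day? Does it always follow this pattern? (01092022 - 02092022 - 03092022)
If you stick to that naming convention, you can use something like this: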
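    from azure.storage.blob import BlobServiceClient

    # Assumes MY_CONNECTION_STRING is defined as in the question
    blob_service_client_instance = BlobServiceClient.from_connection_string(MY_CONNECTION_STRING)

    containers = blob_service_client_instance.list_containers()
    for c in containers:
        # Blobs are named like p.01092022.xlsx, so the dots are part of the name
        blob_name = "p." + c.name + ".xlsx"
        blob_client_instance = blob_service_client_instance.get_blob_client(c.name, blob_name, snapshot=None)
        # Skip containers that do not hold the expected file
        if blob_client_instance.exists():
            blob_data = blob_client_instance.download_blob()
            data = blob_data.readall()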
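But this will become a long-running job after a while, because the number of containers grows every day. You may want to add a line that restricts the search to a range, for example the past 6 months.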
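One way to restrict the range could look like the sketch below. It assumes the container names are always dates in DDMMYYYY form, as in the question, and it treats "6 months" as roughly 183 days. The helper names `parse_container_date`, `recent_containers` and `blob_name_for` are made up for this illustration and are not part of the Azure SDK.

```python
from datetime import date, datetime, timedelta


def parse_container_date(name):
    """Return the date encoded in a DDMMYYYY container name, or None if it does not fit."""
    if len(name) != 8 or not name.isdigit():
        return None
    try:
        return datetime.strptime(name, "%d%m%Y").date()
    except ValueError:
        # Eight digits, but not a real calendar date (e.g. 31022022)
        return None


def recent_containers(names, today, days=183):
    """Keep names whose date lies within `days` days before `today`, oldest first."""
    cutoff = today - timedelta(days=days)
    dated = []
    for name in names:
        d = parse_container_date(name)
        if d is not None and cutoff <= d <= today:
            dated.append((d, name))
    # Sort by the parsed date, not by the raw string (DDMMYYYY does not sort correctly as text)
    return [name for _, name in sorted(dated)]


def blob_name_for(container_name):
    """Build the expected blob name, e.g. 01092022 -> p.01092022.xlsx."""
    return f"p.{container_name}.xlsx"
```

The list of names could come from `[c.name for c in blob_service_client_instance.list_containers()]`, and `today` would normally be `date.today()`. Each surviving name can then be passed, together with `blob_name_for(name)`, to `get_blob_client` as in the loop above.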