
Azure Storage Account blob stream with Python

Using the latest azure.storage.blob (12.4.0) Python library, I need to open a stream on a blob without downloading it completely into memory.

I have HDF5 files stored in a storage account. Using h5py (2.10.0), I need to extract some information and read data without loading the whole file into memory. The files can contain many gigabytes of data.

from io import BytesIO

import h5py

# blob_service_client is assumed to be an existing azure.storage.blob.BlobServiceClient
container_client = blob_service_client.get_container_client('sample')
blob = container_client.get_blob_client('SampleHdF5.hdf5')

stream = BytesIO()
downloader = blob.download_blob()

# This downloads the entire file into memory.
# The file can be many gigabytes, which is the problem.
downloader.readinto(stream)

# Opening the in-memory stream and reading data works fine
f = h5py.File(stream, 'r')

Maybe there's another Azure service better suited to this kind of need.

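You can use get_blob_to_stream from azure.storage.blob.baseblobservice, referring to here. Note that BaseBlobService belongs to the legacy azure-storage-blob SDK (2.x and earlier), not the 12.x client used in the question. These are the packages I used: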


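from azure.storage.blob.baseblobservice import BaseBlobService
import io

# Fill in your own values
connection_string = ""
container_name = ""
blob_name = ""

blob_service = BaseBlobService(connection_string=connection_string)
with io.BytesIO() as input_io:
    # Write the blob's content into the stream
    blob_service.get_blob_to_stream(container_name=container_name, blob_name=blob_name, stream=input_io)
    # Rewind so the stream can be read from the start; use it inside this block, before it is closed
    input_io.seek(0)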
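
Writing the blob into a BytesIO still holds the whole file in memory, so for very large HDF5 files it may be worth trying ranged reads instead. Below is a minimal sketch, not a tested production solution: BlobRangeReader is a hypothetical helper name of my own, and it assumes the object passed in behaves like the v12 BlobClient from the question, that is, it exposes get_blob_properties().size and download_blob(offset=..., length=...).readall(). Each read fetches only the requested byte range.

```python
import io


class BlobRangeReader(io.RawIOBase):
    """Read-only, seekable file-like view of a blob that fetches byte ranges on demand.

    Hypothetical helper: blob_client is assumed to behave like
    azure.storage.blob.BlobClient (v12).
    """

    def __init__(self, blob_client):
        super().__init__()
        self._blob = blob_client
        # One metadata request up front to learn the blob size.
        self._size = blob_client.get_blob_properties().size
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, buffer):
        view = memoryview(buffer).cast("B")
        # Request only as many bytes as the caller's buffer can hold.
        length = min(len(view), self._size - self._pos)
        if length <= 0:
            return 0  # end of blob
        data = self._blob.download_blob(offset=self._pos, length=length).readall()
        view[:len(data)] = data
        self._pos += len(data)
        return len(data)
```

With the blob client from the question, something like h5py.File(io.BufferedReader(BlobRangeReader(blob)), 'r') should then let h5py pull only the ranges it asks for. Every unbuffered read becomes a separate HTTP range request, so wrapping the reader in io.BufferedReader (and tuning its buffer_size) is probably worthwhile when h5py issues many small reads.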
