Using the latest azure.storage.blob (12.4.0) python library, I need to open a stream on a blob without downloading it completely in memory.
I have hdf5 files stored in storage account, using h5py (2.10.0) I need to extract some information, read data without having the file loaded in memory. The files can contains many giga bytes of data.
container_client = blob_service_client.get_container_client('sample')
blob = container_client.get_blob_client('SampleHdF5.hdf5')
stream = BytesIO()
downloader = blob.download_blob()
# download the entire file in memory here
# file can be many giga bytes! Big problem
downloader.readinto(stream)
# works fine to open the stream and read data
f = h5py.File(stream, 'r')
Maybe there's another service more appropriate for this kind of need on Azure.
get_blob_to_stream
can be used with azure.storage.blob.baseblobservice
refering to here . There are packages that I used.
from azure.storage.blob.baseblobservice import BaseBlobService
import io
connection_string = ""
container_name = ""
blob_name = ""
blob_service = BaseBlobService(connection_string=connection_string)
with io.BytesIO() as input_io:
blob_service.get_blob_to_stream(container_name=container_name, blob_name=blob_name, stream=input_io)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.