Download Blob To Local Storage using Python
I'm trying to download a blob file and store it locally on my machine. The file format is HDF5 (a format I have limited/no experience with so far).
So far I've been successful in downloading something using the scripts below. The key issue is that it doesn't seem to be the full file. When downloading the file directly from Storage Explorer it is circa 4,000 KB; the HDF5 file I save is 2 KB.
What am I doing wrong? Am I missing a readall() somewhere?
This is my first time working with blob storage and HDF5 files, so I'm a little stuck right now. A lot of the old questions seem to use deprecated commands, as the azure.storage.blob module has been updated.
from azure.storage.blob import BlobServiceClient
from io import StringIO, BytesIO
import h5py
# Initialise client
blob_service_client = BlobServiceClient.from_connection_string("my_conn_str")
# Initialise container
blob_container_client = blob_service_client.get_container_client("container_name")
# Get blob
blob_client = blob_container_client.get_blob_client("file_path")
# Download
download_stream = blob_client.download_blob()
# Create empty stream
stream = BytesIO()
# Read downloaded blob into stream
download_stream.readinto(stream)
# Create new empty hdf5 file
hf = h5py.File('data.hdf5', 'w')
# Write stream into empty HDF5
hf.create_dataset('dataset_1',stream)
# Close Blob (& save)
hf.close()
I tried to reproduce the scenario in my system and faced the same issue with the code you tried.
So I tried another solution: read the HDF5 file as a stream and write its contents into another HDF5 file.
Try this solution. I used some dummy data for testing purposes.
from azure.storage.blob import BlobServiceClient
from io import BytesIO
import numpy as np
import h5py

# Initialise client
blob_service_client = BlobServiceClient.from_connection_string("Connection String")
# Initialise container
blob_container_client = blob_service_client.get_container_client("test")  # container name
# Get blob
blob_client = blob_container_client.get_blob_client("test.hdf5")  # blob name
print("downloaded the blob")

# Download the entire blob into an in-memory stream.
# Note: this loads the whole file into memory, which can be a
# problem for files that are many gigabytes in size.
stream = BytesIO()
downloader = blob_client.download_blob()
downloader.readinto(stream)
# Rewind the stream before reusing it
stream.seek(0)

# Open the in-memory stream as an HDF5 file and read the data
with h5py.File(stream, "r") as f:
    # List all groups
    print("Keys: %s" % f.keys())
    a_group_key = list(f.keys())[0]
    # Get the data
    data_matrix = list(f[a_group_key])
    print(data_matrix)

# Write the data into a new local HDF5 file
with h5py.File("file1.hdf5", "w") as data_file:
    data_file.create_dataset("group_name", data=data_matrix)
OUTPUT:
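The blob-to-HDF5 round trip above can also be checked without any Azure connection: h5py accepts a file-like object such as BytesIO directly, so you can simulate the downloaded blob entirely in memory. A minimal, self-contained sketch (the dataset name and data are made up for illustration):

```python
import numpy as np
import h5py
from io import BytesIO

# Simulate the downloaded blob: an HDF5 file held entirely in memory.
buf = BytesIO()
with h5py.File(buf, "w") as hf:
    hf.create_dataset("dataset_1", data=np.arange(12).reshape(3, 4))

# Rewind the buffer; h5py seeks for itself, but raw reads such as
# buf.read() would otherwise start from the end of the stream.
buf.seek(0)

# Open the in-memory buffer as an HDF5 file and read the dataset back.
with h5py.File(buf, "r") as hf:
    data = hf["dataset_1"][:]

print(data.shape)  # (3, 4)
```

If the goal is only to save the blob locally byte-for-byte, the same BytesIO contents can simply be written to disk with `open("data.hdf5", "wb")` instead of being re-parsed through h5py.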