How to read docx files from Azure Blob Storage using Python
How can I read docx files from Azure Blob Storage using Python? I use the following code, but blob_content ends up containing only unreadable characters. The code works fine for txt files but not for MS Word documents (*.docx).
Please help if you have a solution.
from azure.storage.blob import BlobServiceClient

blob_service_client_instance = BlobServiceClient(account_url=STORAGEACCOUNTURL, credential=STORAGEACCOUNTKEY)
blob_client_instance = blob_service_client_instance.get_blob_client(container_name, blob_name, snapshot=None)
blob_download = blob_client_instance.download_blob()
# Decoding as UTF-8 is fine for txt blobs, but a docx blob is binary data
blob_content = blob_download.readall().decode('utf-8')
I tried this in my environment and got the results below.
First, I used the following code to download the docx file from Azure Blob Storage, running it from Visual Studio Code.
In the portal, I have a docx file in Azure Blob Storage.
from azure.storage.blob import BlobServiceClient

client = BlobServiceClient.from_connection_string("<Connection string>")
serviceclient = client.get_container_client("test")
bc = serviceclient.get_blob_client(blob="sample.docx")
with open("sample.docx", 'wb') as file:
    data = bc.download_blob()
    file.write(data.readall())
The above code worked and downloaded the docx file from Azure Blob Storage. When I opened the file in Visual Studio Code, it was displayed in the source code editor rather than as a Word document.
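Seeing unreadable characters in a text editor does not mean the download is broken, because a .docx file is a ZIP archive rather than plain text. As a rough sanity check, the sketch below tests whether a blob's bytes look like a Word document, using only the standard library. The function name looks_like_docx is made up for illustration and is not part of the Azure SDK or python-docx.

```python
import io
import zipfile


def looks_like_docx(data: bytes) -> bool:
    """Heuristic check that `data` is a Word document rather than plain text."""
    stream = io.BytesIO(data)
    # A .docx file is a ZIP archive; anything else cannot be a valid .docx.
    if not zipfile.is_zipfile(stream):
        return False
    stream.seek(0)
    with zipfile.ZipFile(stream) as archive:
        # Word stores the main document body in this archive member.
        return "word/document.xml" in archive.namelist()
```

If this returns True for the bytes returned by readall(), the blob was most likely downloaded intact and only needs a docx-aware reader.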
I then used the following code to read the docx file that was downloaded from Azure Blob Storage.
Code:
import docx

doc = docx.Document("<path of the downloaded file >")
all_paras = doc.paragraphs
for para in all_paras:
    print(para.text)
After I executed the above code, I was able to read the docx file successfully.
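As for why decode('utf-8') produces unreadable characters: a .docx file is a ZIP archive of XML parts, so its bytes are not UTF-8 text. The following is a minimal sketch, using only the standard library, of pulling paragraph text straight from the downloaded bytes without saving a file first. The function name extract_docx_text is hypothetical, and the sketch assumes the bytes come from download_blob().readall(). It ignores tabs, line breaks, headers and footers, so it is not a replacement for python-docx.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> list:
    """Return the text of each paragraph in a .docx given as raw bytes.

    `data` is assumed to be what download_blob().readall() returns.
    """
    # A .docx file is a ZIP archive, so open the bytes as one in memory.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> element is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(W_NS + "p"):
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs
```

With python-docx installed, the same in-memory idea should also work by passing a file-like object instead of a path, for example docx.Document(io.BytesIO(data)), which avoids writing the blob to disk.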