
How to read docx files from Azure Blob Storage using Python

How can I read docx files from Azure Blob Storage using Python? I use the following code, but blob_content ends up containing only unreadable characters. The code works fine for txt files but not for MS Word documents (*.docx).

Please help if you have a solution.

from azure.storage.blob import BlobServiceClient

blob_service_client_instance = BlobServiceClient(account_url=STORAGEACCOUNTURL, credential=STORAGEACCOUNTKEY)
blob_client_instance = blob_service_client_instance.get_blob_client(container_name, blob_name, snapshot=None)
blob_download = blob_client_instance.download_blob()
blob_content = blob_download.readall().decode('utf-8')

I tried this in my environment and got the results below:

First, I used the following code in Visual Studio Code to download the docx file from Azure Blob Storage.

In the portal, I have a docx file in Azure Blob Storage.


from azure.storage.blob import BlobServiceClient

# Connect to the storage account and locate the blob
client = BlobServiceClient.from_connection_string("<Connection string>")
container_client = client.get_container_client("test")
bc = container_client.get_blob_client(blob="sample.docx")

# Write the raw bytes to a local file; do not decode them as text
with open("sample.docx", "wb") as file:
    data = bc.download_blob()
    file.write(data.readall())

The above code worked and downloaded the docx file from Azure Blob Storage. When I opened the downloaded file in Visual Studio Code, it was shown in the source code editor rather than as a Word document, so its content was not readable there.
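The unreadable characters are expected, because a .docx file is a ZIP archive of XML parts rather than plain text, which is why decoding its bytes as UTF-8 does not give readable output. The following is a minimal sketch using only the standard library; the helper name looks_like_docx and the stand-in archive are hypothetical and only illustrate the check, they are not a complete Word document.

```python
import io
import zipfile


def looks_like_docx(data: bytes) -> bool:
    """Return True if the bytes are a ZIP archive containing word/document.xml."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        return "word/document.xml" in archive.namelist()


# Build a tiny stand-in archive in memory (hypothetical sample data,
# not a complete Word document) to exercise the check.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")
sample_bytes = sample.getvalue()

print(sample_bytes[:2])                 # ZIP local file header signature
print(looks_like_docx(sample_bytes))    # archive with word/document.xml
print(looks_like_docx(b"plain text"))   # ordinary text is not a ZIP archive
```

The same check could be applied to the bytes returned by download_blob().readall() before deciding how to parse them.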


Next, I used the following code to read the docx file that was downloaded from Azure Blob Storage.

Code:

import docx  # provided by the python-docx package

doc = docx.Document("<path of the downloaded file>")
# Print the text of each paragraph in the document
for para in doc.paragraphs:
    print(para.text)

After I executed the above code, the docx file was read successfully and its text was printed to the console.
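If saving the file to disk is not wanted, the downloaded bytes can also be parsed in memory. The sketch below is not the approach used above: it swaps python-docx for the standard library (zipfile and xml.etree.ElementTree) so that it runs without extra packages. The function name docx_bytes_to_paragraphs and the sample XML are hypothetical, and the function collects every w:p element, including those inside tables, so its output may differ slightly from doc.paragraphs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# WordprocessingML namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_bytes_to_paragraphs(data: bytes) -> List[str]:
    """Extract paragraph text from the raw bytes of a .docx file."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more w:t elements
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


# Hypothetical stand-in for blob bytes: a minimal archive holding only
# the main document part, enough to exercise the function.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>from </w:t></w:r><w:r><w:t>Azure</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", DOCUMENT_XML)

result = docx_bytes_to_paragraphs(buffer.getvalue())
print(result)
```

With the blob client from the download step, the call would be docx_bytes_to_paragraphs(bc.download_blob().readall()). python-docx's Document also accepts a file-like object, so docx.Document(io.BytesIO(data)) should similarly avoid the temporary file while keeping the doc.paragraphs loop unchanged.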

