
Read excel data from Azure blob and convert into csv using Python azure function

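I would like to deploy an Azure function with the following functionality:

  1. Read Excel data from an Azure blob into a stream object instead of downloading it onto the VM.
  2. Read it into a data frame.

I need help reading the Excel file into a data frame. How do I update the placeholder download_file_path to read the Excel data?
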
    import pandas as pd 
    import os 
    import io
    from azure.storage.blob import BlobClient,BlobServiceClient,ContentSettings
        
    connectionstring="XXXXXXXXXXXXXXXX" 
    excelcontainer = "excelcontainer"        
    excelblobname="Resource.xlsx" 
    sheet ="Resource" 
            
    blob_service_client =BlobServiceClient.from_connection_string(connectionstring)
    download_file_path =os.path.join(excelcontainer)
    blob_client = blob_service_client.get_blob_client(container=excelcontainer, blob=excelblobname)
    with open(download_file_path, "rb") as f:
       data_bytes = f.read()
    df =pd.read_excel(data_bytes, sheet_name=sheet, encoding = "utf-16")

If you want to read an Excel file from an Azure blob with pandas, you have two options:

  1. Generate a SAS token for the blob, then use the blob URL with the SAS token to access it

    from datetime import datetime, timedelta

    import azure.functions as func
    import pandas as pd
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    def main(req: func.HttpRequest) -> func.HttpResponse:
        account_name = 'andyprivate'
        account_key = 'h4pP1fe76*****A=='
        container_name = 'test'
        blob_name = 'sample.xlsx'
        # Create a read-only SAS token that expires in one hour
        sas = generate_blob_sas(
            account_name=account_name,
            container_name=container_name,
            blob_name=blob_name,
            account_key=account_key,
            permission=BlobSasPermissions(read=True),
            expiry=datetime.utcnow() + timedelta(hours=1)
        )

        # pandas can read the workbook directly over HTTPS from the SAS URL
        blob_url = f'https://{account_name}.blob.core.windows.net/{container_name}/{blob_name}?{sas}'
        df = pd.read_excel(blob_url)
        print(df)
        ......
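
One detail the SAS URL approach can trip over: a blob name containing spaces or other reserved characters has to be percent-encoded before it goes into the URL path. Below is a minimal sketch of a URL builder, assuming the same public `blob.core.windows.net` endpoint as above. The helper name `build_blob_sas_url` is hypothetical and not part of the Azure SDK.

```python
from urllib.parse import quote


def build_blob_sas_url(account_name, container_name, blob_name, sas_token):
    """Return an HTTPS blob URL with the SAS token appended as the query string.

    Hypothetical helper for illustration; it is not part of azure-storage-blob.
    """
    # quote() percent-encodes spaces and other unsafe characters. "/" is kept
    # literal, so virtual directories inside the blob name still resolve.
    encoded_name = quote(blob_name)
    # Tolerate a token that was copied with a leading "?".
    token = sas_token.lstrip("?")
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_name}/{encoded_name}?{token}"
    )
```

The resulting string can be passed to `pd.read_excel` in the same way as `blob_url` above.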


  2. Download the blob

    import io

    import azure.functions as func
    import pandas as pd
    from azure.storage.blob import BlobServiceClient

    def main(req: func.HttpRequest) -> func.HttpResponse:
        account_name = 'andyprivate'
        account_key = 'h4pP1f****='

        blob_service_client = BlobServiceClient(account_url=f'https://{account_name}.blob.core.windows.net/', credential=account_key)
        blob_client = blob_service_client.get_blob_client(container='test', blob='sample.xlsx')
        # Download the blob content into memory; nothing is written to disk
        downloader = blob_client.download_blob()
        # Wrap the bytes in a file-like object before handing them to pandas
        df = pd.read_excel(io.BytesIO(downloader.readall()))
        print(df)
        ....
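
The title also asks about converting to CSV, which neither option shows. Here is a rough sketch of how the download approach could be extended to write the data frame back to blob storage as CSV, entirely in memory. The function names and the `csv_blob` parameter are assumptions made for illustration, and reading `.xlsx` files with pandas typically needs the `openpyxl` package installed.

```python
import io

import pandas as pd


def excel_bytes_to_dataframe(excel_bytes, sheet_name=0):
    """Parse an in-memory Excel workbook into a DataFrame."""
    return pd.read_excel(io.BytesIO(excel_bytes), sheet_name=sheet_name)


def dataframe_to_csv_bytes(df):
    """Serialize a DataFrame to UTF-8 encoded CSV without the index column."""
    return df.to_csv(index=False).encode("utf-8")


def convert_blob_to_csv(connection_string, container, excel_blob, csv_blob, sheet_name=0):
    """Download an Excel blob, convert it, and upload the CSV to csv_blob.

    Hypothetical helper: the parameter names are illustrative only.
    """
    # Imported here so the two pure helpers above work without the Azure SDK.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    source = service.get_blob_client(container=container, blob=excel_blob)
    excel_bytes = source.download_blob().readall()

    df = excel_bytes_to_dataframe(excel_bytes, sheet_name)

    target = service.get_blob_client(container=container, blob=csv_blob)
    # overwrite=True replaces an existing CSV blob of the same name.
    target.upload_blob(dataframe_to_csv_bytes(df), overwrite=True)
    return df
```

Inside the function's `main`, this might be called as `convert_blob_to_csv(connectionstring, "excelcontainer", "Resource.xlsx", "Resource.csv", sheet_name="Resource")`, where the output name `Resource.csv` is only an example.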

