How to read xlsx blob into pandas from Azure function in python
I am reading in .xlsx data from a blob in an Azure function. My code looks something like this:
def main(techdatablob: func.InputStream, crmdatablob: func.InputStream, outputblob: func.Out[func.InputStream]):
    # Load in the tech and crm data
    crm_data = pd.read_excel(crmdatablob.read().decode('ISO-8859-1'))
    tech_data = pd.read_excel(techdatablob.read().decode('ISO-8859-1'))
The issue is that when I try to decode the files, I get the following error:
ValueError: Protocol not known: PK...
There are also a lot of weird characters after the "...". Any ideas on how to read these files in properly?
Please refer to my code. It seems you do not need to add decode('ISO-8859-1'):
import logging
import pandas as pd
import azure.functions as func

def main(techdatablob: func.InputStream, crmdatablob: func.InputStream, outputblob: func.Out[func.InputStream]):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {techdatablob.name}\n"
                 f"Blob Size: {techdatablob.length} bytes")
    # Load in the tech and crm data
    crm_data = pd.read_excel(crmdatablob.read())
    logging.info(f"{crm_data}")
    tech_data = pd.read_excel(techdatablob.read())
    logging.info(f"{tech_data}")
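As background on why decoding fails: an .xlsx file is a ZIP archive, so its bytes begin with "PK". Once decoded to a str, pandas treats the value as a path or URL rather than as file content, which would explain the "PK..." in the error. The sketch below uses only the standard library to illustrate this and to wrap the raw bytes in io.BytesIO, a cautious choice since some newer pandas versions are reported to warn when raw bytes are passed directly. The names as_excel_buffer and make_fake_xlsx_bytes are hypothetical helpers, not part of pandas or azure.functions, and the fake archive is not a real workbook.

```python
import io
import zipfile


def as_excel_buffer(raw: bytes) -> io.BytesIO:
    """Wrap raw blob bytes in a seekable buffer after a basic sanity check.

    Hypothetical helper for illustration only.
    """
    if isinstance(raw, str):
        raise TypeError("expected bytes; do not call .decode() on .xlsx content")
    buffer = io.BytesIO(raw)
    # .xlsx files are ZIP containers, so anything else cannot be a workbook.
    if not zipfile.is_zipfile(buffer):
        raise ValueError("blob content is not a ZIP container, so it is not .xlsx")
    buffer.seek(0)  # rewind so the reader starts at the first byte
    return buffer


def make_fake_xlsx_bytes() -> bytes:
    """Build a tiny ZIP archive that stands in for a real workbook (demo only)."""
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
    return out.getvalue()


raw = make_fake_xlsx_bytes()
print(raw[:2])                       # the ZIP magic bytes
print(raw.decode("ISO-8859-1")[:2])  # the same two characters, now as text
buffer = as_excel_buffer(raw)
```

Inside the function, one would then call something like pd.read_excel(as_excel_buffer(crmdatablob.read())), assuming an Excel engine such as openpyxl is installed in the function app.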
Note: your function.json should look like the following. Otherwise you will get an error.
{
  "name": "techdatablob",
  "type": "blobTrigger",
  "direction": "in",
  "path": "path1/{name}",
  "connection": "example"
},
{
  "name": "crmdatablob",
  "dataType": "binary",
  "type": "blob",
  "direction": "in",
  "path": "path2/data.xlsx",
  "connection": "example"
},
{
  "name": "outputblob",
  "type": "blob",
  "direction": "out",
  "path": "path3/out.xlsx",
  "connection": "example"
}
The difference between this and your function.json is that yours is missing the dataType property.
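To catch a missing dataType before deploying, a small check along these lines might help. It is only a sketch: it assumes the bindings sit in a top-level "bindings" array of function.json, and blob_inputs_without_binary is a hypothetical helper name, not an Azure tool. It flags only blob input bindings, matching the configuration shown above.

```python
import json
from typing import List


def blob_inputs_without_binary(function_json_text: str) -> List[str]:
    """Return names of blob input bindings that do not set dataType to binary."""
    config = json.loads(function_json_text)
    flagged = []
    for binding in config.get("bindings", []):
        is_blob_input = (
            binding.get("type") == "blob" and binding.get("direction") == "in"
        )
        if is_blob_input and binding.get("dataType") != "binary":
            flagged.append(binding.get("name", "<unnamed>"))
    return flagged


# Illustrative configuration that mirrors the bindings above,
# but with dataType left out of the blob input.
SAMPLE = """
{
  "bindings": [
    {"name": "techdatablob", "type": "blobTrigger", "direction": "in",
     "path": "path1/{name}", "connection": "example"},
    {"name": "crmdatablob", "type": "blob", "direction": "in",
     "path": "path2/data.xlsx", "connection": "example"},
    {"name": "outputblob", "type": "blob", "direction": "out",
     "path": "path3/out.xlsx", "connection": "example"}
  ]
}
"""

print(blob_inputs_without_binary(SAMPLE))
```

An empty list would mean every blob input binding already declares "dataType": "binary".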
In my test this worked, and there seemed to be no problems.