Azure Functions - Python (Blob Trigger and Binding)
I have reviewed the documentation provided by Microsoft on triggers: [https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-trigger?tabs=python][1]
Indeed, using the func.InputStream parameter in the Azure Function allows us to retrieve the blob and some of its properties (name, uri, length), and we can read the raw bytes using the read() function. But how do we transform those bytes into an object we can manipulate, such as a Pandas dataframe (or any other kind of object for other file types, e.g. a jpg)?
My function.json file can be found below:
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "statscan/raw/ncdb/{name}",
      "connection": ""
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "outputBlob",
      "path": "statscan/enriched/func/{name}.csv",
      "connection": ""
    }
  ]
}
The Blob Trigger function can be found below:
import logging

import azure.functions as func
import pandas as pd


def main(myblob: func.InputStream, outputBlob: func.Out[str]):
    logging.info("Blob trigger executed!")
    logging.info(f"Blob Name: {myblob.name} ({myblob.length} bytes)")
    logging.info(f"Full Blob URI: {myblob.uri}")

    ### Manipulate with Pandas ###

    ### Output ###
    output = ''
    outputBlob.set(output)
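As a sketch of what the missing "Manipulate with Pandas" step could look like (assuming the triggering blob is a CSV file): wrap the bytes returned by myblob.read() in an in-memory buffer and hand it to pandas. The helper name and the sample bytes below are illustrative, not part of the original question.

```python
import io

import pandas as pd


def blob_bytes_to_dataframe(blob_bytes: bytes) -> pd.DataFrame:
    """Parse raw CSV bytes (e.g. the result of myblob.read()) into a DataFrame."""
    return pd.read_csv(io.BytesIO(blob_bytes))


# In-memory bytes standing in for myblob.read():
raw = b"city,population\nOttawa,1017449\nToronto,2794356\n"
df = blob_bytes_to_dataframe(raw)
print(df.shape)  # (2, 2)
```

For a binary format such as jpg, the same io.BytesIO wrapper works; you would simply pass it to the relevant library (e.g. an image reader) instead of pd.read_csv.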
We have multiple ways to check the file content and read it accordingly; in your case, let's treat the blob as CSV format.

To achieve this we can download the blob and then read the data into a dataframe. Below is the way I tried it, from the MS docs (https://docs.microsoft.com/en-us/azure/architecture/data-science-process/explore-data-blob):
import time

import pandas as pd
from azure.storage.blob import BlobServiceClient

STORAGEACCOUNTURL = <storage_account_url>
STORAGEACCOUNTKEY = <storage_account_key>
LOCALFILENAME = <local_file_name>
CONTAINERNAME = <container_name>
BLOBNAME = <blob_name>

# Download from blob
t1 = time.time()
blob_service_client_instance = BlobServiceClient(
    account_url=STORAGEACCOUNTURL, credential=STORAGEACCOUNTKEY)
blob_client_instance = blob_service_client_instance.get_blob_client(
    CONTAINERNAME, BLOBNAME, snapshot=None)
with open(LOCALFILENAME, "wb") as my_blob:
    blob_data = blob_client_instance.download_blob()
    blob_data.readinto(my_blob)
t2 = time.time()
print(("It takes %s seconds to download " + BLOBNAME) % (t2 - t1))
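The snippet above downloads the blob but stops short of reading it. The follow-on step is a plain pd.read_csv on the local path; the sketch below uses a temporary file as a stand-in for LOCALFILENAME so it is self-contained.

```python
import tempfile

import pandas as pd

# Stand-in for the downloaded blob: write a small CSV to a temp file
with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
    f.write("a,b\n1,2\n3,4\n")
    LOCALFILENAME = f.name

# Read the downloaded file into a dataframe
df = pd.read_csv(LOCALFILENAME)
print(len(df))  # 2
```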
Alternatively, we can read the blob content straight into a dataframe without saving it locally (this uses get_blob_to_text from the legacy azure-storage SDK):

from io import StringIO

blobstring = blob_service.get_blob_to_text(CONTAINERNAME, BLOBNAME).content
df = pd.read_csv(StringIO(blobstring))
The other way is with a blob SAS URL, which we can get by right-clicking on the blob and choosing "Generate SAS":

import pandas as pd

data = pd.read_csv('blob_sas_url')
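To complete the round trip with the outputBlob binding from the question, the dataframe can be serialized back to a CSV string with to_csv and that string passed to outputBlob.set(). A minimal sketch (the sample dataframe is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"city": ["Ottawa", "Toronto"],
                   "population": [1017449, 2794356]})

# Serialize the dataframe to CSV text; this string is what you would
# pass to outputBlob.set(output) in the trigger function.
output = df.to_csv(index=False)
print(output.splitlines()[0])  # city,population
```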