
How to read multiple files from a storage container in Azure Functions

I have an Azure Functions application (Python) in which I need to read multiple CSV files that are stored in an Azure Storage Account (StorageV2) in order to validate them.

However, the filenames and the number of CSV files in this folder change over time. The application is triggered using an HTTP binding, and it would be ideal to dynamically check the contents of the folder and then sequentially process all the CSV files in it.

From the documentation it seems that Azure Functions uses bindings for input and output; however, the examples only show (multiple) input bindings that each point to a single file, not to a folder or container of any kind. Because I do not know the number of files or their names beforehand, this would be difficult to implement.

E.g. function.json:

{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ]
    },
    {
      "name": "inputcsv",
      "type": "blob",
      "dataType": "binary",
      "path": "samplesCSVs/{singleCSVfile}",
      "connection": "MyStorageConnectionAppSetting",
      "direction": "in"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    }
  ],
  "scriptFile": "__init__.py"
}

Is it possible to point to a folder here? Or to dynamically read the files in a Storage Account in another way?

The only other alternative I can think of is to simply zip all the CSV files in advance, so I can use one input binding for the zipped file and then unpack it into a temporary folder to process the files, but that would be less efficient.

Sources:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-input?tabs=python

https://docs.microsoft.com/en-us/azure/azure-functions/functions-add-output-binding-storage-queue-vs-code?tabs=in-process&pivots=programming-language-python

With the Azure Blob trigger you can only match one-to-one: a change to, or creation of, a single blob triggers one execution of a function.

Alternatively, you can use Event Grid, filter events at the container level, and use an Azure Function to handle that particular event:

https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-event-overview
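As a sketch, an Event Grid-triggered function would replace the httpTrigger binding in the function.json above with an eventGridTrigger binding (the binding name `event` here is arbitrary); the event payload then carries the URL of the blob that changed:

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "type": "eventGridTrigger",
      "direction": "in",
      "name": "event"
    }
  ]
}
```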

It seems I had a misunderstanding about how Azure Functions works. Because the function body is still plain Python code, and Azure provides a Python SDK for connecting to a Storage Account and manipulating blobs, using the SDK directly is the best way to accomplish what I was trying to do.

The input/output bindings of Azure Functions only seem helpful when using specific triggers; they were not required for my problem.
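As a minimal sketch of that approach, the function below lists every blob in a container at runtime and validates each CSV, using the `azure-storage-blob` package (`pip install azure-storage-blob`). The container name `samplescsvs`, the expected column count, and the validation rule (every row has the same number of columns) are assumptions for illustration:

```python
import csv
import io


def validate_csv_bytes(data: bytes, expected_columns: int) -> bool:
    """Return True if the CSV is non-empty and every row has expected_columns fields."""
    rows = list(csv.reader(io.StringIO(data.decode("utf-8"))))
    return bool(rows) and all(len(row) == expected_columns for row in rows)


def validate_container(connection_string: str, container: str = "samplescsvs") -> dict:
    """Download and validate every *.csv blob in the container; returns {name: is_valid}."""
    # Imported inside the function so the module still loads without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container_client = service.get_container_client(container)

    results = {}
    # list_blobs() discovers the filenames at runtime, so nothing is hard-coded.
    for blob in container_client.list_blobs():
        if not blob.name.lower().endswith(".csv"):
            continue
        data = container_client.download_blob(blob.name).readall()
        results[blob.name] = validate_csv_bytes(data, expected_columns=3)
    return results
```

Inside an HTTP-triggered `__init__.py`, `validate_container` could be called with the connection string read from an app setting (e.g. `os.environ["MyStorageConnectionAppSetting"]`), and the resulting dict returned in the HTTP response.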

Thanks to zolty13 for pointing me in the right direction.

Source:

https://docs.microsoft.com/en-us/python/api/overview/azure/storage-blob-readme?view=azure-python

