
Azure Functions: How to manage Durable Functions with Blob Triggers?

Imagine that I have a storage account with a blob container to which files are uploaded from time to time. I want to process each file that lands in the container: open it, extract information, and store it. That is definitely an expensive operation that could fit a Durable Functions scenario.

Here's the trigger:

        [FunctionName("PayrollFileTrigger")]
        public static async Task Start(
            [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob, string name,
            [DurableClient] IDurableOrchestrationClient starter,
            ILogger log)
        {
            string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", "payroll_file", name);
        }

...which calls the orchestration:


        [FunctionName("PayrollFile_StartFunction")]
        public static async Task<IActionResult> Run(
            [OrchestrationTrigger] IDurableOrchestrationContext context, string blobName,
            ExecutionContext executionContext, ILogger log)
        {
            // Download the blob
            string filePath =
                await context.CallActivityWithRetryAsync<string>("DownloadPayrollBlob", options, blobName);

            if (filePath == null) return ErrorResult(ERROR_MSG_1, log);

            // Extract the data
            var payroll =
                await context.CallActivityWithRetryAsync<Payroll>("ExtractBlobData", options, filePath);

            // ... and so on (just a sample here) ...
        }

But there is a problem. While testing, the following error occurs, which I think means that I can't start another orchestration with the same ID:

An Orchestration instance with the status Pending already exists.



1 - So if I push many files in a short period of time to the container the trigger is listening to, will the orchestration get busy with one of them and ignore the events for the others?

2 - When does the orchestration leave the Pending status? Does that happen automatically?

3 - Should I create a new orchestration instance for each file to be processed? I know you can omit the instanceId parameter so that it is generated randomly and never conflicts with one already started. But is that safe to do? How do I manage the instances and make sure they eventually finish?

string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", "payroll_file", name);

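The second argument is the instanceId, which must be unique. Every blob here is started with the same fixed ID, "payroll_file", so a second start attempt is rejected while the first instance is still pending or running.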

Instead, try:

string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", input: name);
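
For context, here is a minimal sketch of the whole trigger with that change applied, assuming the in-process Durable Functions 2.x C# API (`Microsoft.Azure.WebJobs.Extensions.DurableTask`). The class name `PayrollFileFunctions` and the log message are illustrative and not from the question. The named `input:` argument seems to matter: as far as I know, a bare string in the second position binds to the overload whose second parameter is the instance ID.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

public static class PayrollFileFunctions // hypothetical class name
{
    [FunctionName("PayrollFileTrigger")]
    public static async Task Start(
        [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob,
        string name,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        // No instance ID is supplied, so the runtime generates one for each call.
        // The named argument selects StartNewAsync<T>(functionName, input)
        // instead of the overload that takes an instance ID.
        string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", input: name);

        // Record which orchestration belongs to which blob.
        log.LogInformation("Started orchestration {InstanceId} for blob {BlobName}", instanceId, name);
    }
}
```

Inside the orchestrator, the blob name passed this way would normally be read with `context.GetInput<string>()`.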

Depending on what you need, you might want only one durable instance per file. In general, though, Microsoft recommends the following:

Use a random identifier for the instance ID. Random instance IDs help ensure an equal load distribution when you're scaling orchestrator functions across multiple VMs. The proper time to use non-random instance IDs is when the ID must come from an external source, or when you're implementing the singleton orchestrator pattern.

https://docs.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-instance-management?tabs=csharp#start-instances
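
If one instance per file is what you want, the singleton pattern mentioned in that quote could look roughly like the sketch below. It assumes the Durable Functions 2.x in-process API. The class name, the function name `PayrollFileSingletonTrigger`, the `payroll-` prefix and the `HashName` helper are all hypothetical. Hashing the blob name is my own precaution, on the assumption that a raw blob name may contain characters that are not accepted in an instance ID.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

public static class PayrollFileSingleton // hypothetical class name
{
    [FunctionName("PayrollFileSingletonTrigger")] // hypothetical function name
    public static async Task Start(
        [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob,
        string name,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        // Deterministic ID: the same blob name always maps to the same instance.
        string instanceId = "payroll-" + HashName(name);

        // GetStatusAsync returns null when no instance with this ID exists.
        DurableOrchestrationStatus existing = await starter.GetStatusAsync(instanceId);
        bool canStart = existing == null
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Completed
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Failed
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Terminated;

        if (!canStart)
        {
            log.LogWarning("Instance {InstanceId} for blob {BlobName} is still active; skipping.", instanceId, name);
            return;
        }

        await starter.StartNewAsync("PayrollFile_StartFunction", instanceId, name);
    }

    // Hypothetical helper: hex-encoded SHA-256 of the blob name.
    private static string HashName(string name)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(name));
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

The check and the start are two separate calls, so two near-simultaneous events for the same blob could still race. Treat this as a sketch rather than a guarantee.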

In your specific case I'd say you can go without supplying the instanceId yourself, and perhaps log the generated instanceId or write it to a storage solution alongside information about the file that started the orchestration.
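
Once the instance ID is stored somewhere, it can be used later to check whether that orchestration has finished. The helper below is only a sketch under the same Durable Functions 2.x assumption. `OrchestrationStatusHelper` and `HasFinishedAsync` are hypothetical names, and the client would come from a `[DurableClient]` binding in whichever function calls it.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class OrchestrationStatusHelper // hypothetical helper class
{
    // Returns true once the instance has reached a terminal state.
    public static async Task<bool> HasFinishedAsync(
        IDurableOrchestrationClient client, string instanceId)
    {
        DurableOrchestrationStatus status = await client.GetStatusAsync(instanceId);

        // Null means no instance with this ID is known to the task hub.
        if (status == null) return false;

        switch (status.RuntimeStatus)
        {
            case OrchestrationRuntimeStatus.Completed:
            case OrchestrationRuntimeStatus.Failed:
            case OrchestrationRuntimeStatus.Terminated:
            case OrchestrationRuntimeStatus.Canceled:
                return true;
            default:
                // Pending, Running and ContinuedAsNew are still in flight.
                return false;
        }
    }
}
```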

