Imagine that I have a storage account with a blob container to which files are uploaded from time to time. I want to process each file that reaches the blob storage: open it, extract information, and store it. This is definitely an expensive operation that could fit a Durable Functions scenario.
Here's the trigger:
[FunctionName("PayrollFileTrigger")]
public static async Task Start(
    [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob, string name,
    [DurableClient] IDurableOrchestrationClient starter,
    ILogger log)
{
    string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", "payroll_file", name);
}
...which calls the orchestration:
[FunctionName("PayrollFile_StartFunction")]
public async static Task<IActionResult> Run(
    [OrchestrationTrigger] IDurableOrchestrationContext context, string blobName,
    ExecutionContext executionContext, ILogger log)
{
    //Downloads the blob
    string filePath =
        await context.CallActivityWithRetryAsync<string>("DownloadPayrollBlob", options, blobName);
    if (filePath == null) return ErrorResult(ERROR_MSG_1, log);
    //Extract data
    var payroll =
        await context.CallActivityWithRetryAsync<Payroll>("ExtractBlobData", options, filePath);
    ... and so on (just a sample here) ...
}
But there is a problem. While testing, the following error occurs, which I think means that I can't start another orchestration with the same ID:
An Orchestration instance with the status Pending already exists.
1 - So if I push many files in a short period of time to the container the trigger is listening to, will the orchestration get busy with one of them and ignore the other events?
2 - When will the orchestration leave the Pending status? Does that happen automatically?
3 - Should I create a new orchestration instance for each file to be processed? I know you can omit the instanceId parameter, so that it gets generated randomly and never conflicts with one already started. But is that safe to do? How do I manage the instances and ensure they finish at some point?
string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", "payroll_file", name);
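The second argument is the instanceId, which must be unique. Your call always passes the literal "payroll_file", so every uploaded file tries to start an instance with the same ID.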
Instead, try:
string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", input: name);
Depending on what you want, you might prefer to have only one durable instance per file. Microsoft's guidance on instance IDs is:
Use a random identifier for the instance ID. Random instance IDs help ensure an equal load distribution when you're scaling orchestrator functions across multiple VMs. The proper time to use non-random instance IDs is when the ID must come from an external source, or when you're implementing the singleton orchestrator pattern.
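If one instance per file is what you want, the singleton pattern mentioned in that quote could be sketched roughly as below. This is only a sketch under assumptions: it targets the Durable Functions 2.x in-process C# API, it reuses the blob name as the instance ID (which only works if the name is valid as an instance ID), and the function name PayrollFileSingletonTrigger is made up for illustration.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

public static class PayrollFileSingletonTrigger
{
    // Hypothetical variant of the trigger from the question.
    [FunctionName("PayrollFileSingletonTrigger")]
    public static async Task Start(
        [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob, string name,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        // Assumption: the blob name is acceptable as an instance ID.
        string instanceId = name;

        // Look up any earlier instance that used the same ID.
        DurableOrchestrationStatus existing = await starter.GetStatusAsync(instanceId);

        // Only start when no instance exists or the previous one has ended.
        bool canStart = existing == null
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Completed
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Failed
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Terminated;

        if (canStart)
        {
            await starter.StartNewAsync("PayrollFile_StartFunction", instanceId, name);
            log.LogInformation("Started orchestration {InstanceId}", instanceId);
        }
        else
        {
            log.LogWarning("Orchestration {InstanceId} is still {Status}; skipping.",
                instanceId, existing.RuntimeStatus);
        }
    }
}
```

With this shape, a second upload of the same file while the first is still Pending or Running is skipped instead of failing with the "already exists" error.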
In your specific case I'd say you can go without supplying the instanceId yourself and perhaps log the generated instanceId, or write it to a storage solution alongside information about the file that started the orchestration.
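A minimal sketch of that suggestion, assuming the same Durable Functions 2.x in-process API as the question. The class name PayrollFileTriggerSketch is hypothetical, and logging stands in for whatever storage you would actually use to track the ID:

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

public static class PayrollFileTriggerSketch
{
    [FunctionName("PayrollFileTrigger")]
    public static async Task Start(
        [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob, string name,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        // No instance ID is supplied, so a random one is generated.
        // The named argument makes sure the string is treated as input,
        // not as the instance ID.
        string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", input: name);

        // Record which file started which instance (a log line here;
        // a table or database row would be an alternative).
        log.LogInformation("Blob {BlobName} started orchestration {InstanceId}", name, instanceId);
    }
}
```

With the ID recorded, starter.GetStatusAsync(instanceId) can be called later to check whether the orchestration for a given file has finished.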