
Azure Durable Functions' CallActivityWithRetryAsync does not retry on failure

In my orchestrator function, I upload a request file to an external server. After some non-deterministic amount of time, a response file should have been generated. I need to poll for this file and download it.

My current approach is to wait 10 minutes after the upload and then call the built-in CallActivityWithRetryAsync with RetryOptions. After the first poll/download failure, it should wait 5 minutes before retrying, up to a maximum of 10 attempts. A retry should only be attempted if an exception with the message RESPONSE_FILE_NOT_YET_AVAILABLE is thrown within the activity function.

        var nowIn10Minutes = ctx.CurrentUtcDateTime.AddMinutes(10);
        await ctx.CreateTimer(nowIn10Minutes, CancellationToken.None);

        const string RETRY_ERROR_MESSAGE = "RESPONSE_FILE_NOT_YET_AVAILABLE";
        var retryOptions = new RetryOptions(TimeSpan.FromMinutes(5), 10)
        {
            Handle = ex => ex.Message == RETRY_ERROR_MESSAGE
        };
        await ctx.CallActivityWithRetryAsync(nameof(PollForResponseAndDownload), retryOptions, input);

However, according to the logs, this retry logic is not honoured. See below.

After waiting the 10 minutes set in the timer, the activity runs once and the orchestration then fails immediately with a FunctionFailedException. No retry is executed, even though the expected exception message is shown in the logs.

Am I fundamentally misunderstanding the process? Here are the relevant logs:

-> After uploading the request, wait 10mins

2022-01-31 00:00:06.740 <GUID>: Function 'MyOrchestrator (Orchestrator)' is waiting for input. Reason: CreateTimer:2022-01-31T00:10:06.5093237Z. IsReplay: False. State: Listening. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 112.
2022-01-31 00:00:06.741 <GUID>: Function 'MyOrchestrator (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 113.

-> Resume after 10mins, schedule activity function for execution

2022-01-31 00:10:32.700 <GUID>: Function 'MyOrchestrator (Orchestrator)' was resumed by a timer scheduled for '2022-01-31T00:10:06.5093237Z'. IsReplay: False. State: TimerExpired. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 114.
2022-01-31 00:10:32.701 <GUID>: Function 'PollForResponseAndDownload (Activity)' scheduled. Reason: MyOrchestrator. IsReplay: False. State: Scheduled. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 115.
2022-01-31 00:10:32.701 <GUID>: Function 'MyOrchestrator (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 116.

-> Start activity function. It fails immediately with expected ex.Message, but still fails to run retry logic.

2022-01-31 00:10:32.715 <GUID>: Function 'PollForResponseAndDownload (Activity)' started. IsReplay: False. Input: (368 bytes). State: Started. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 117. TaskEventId: 5
2022-01-31 00:10:37.078 <GUID>: Function 'PollForResponseAndDownload (Activity)' failed with an error. Reason: System.Exception: RESPONSE_FILE_NOT_YET_AVAILABLE at MyNamespace.func._getResponseFileContents(String fileHeader) in C:\Users\me\source\AppName\func.cs:line ...
2022-01-31 00:10:37.364 <GUID>: Function 'MyOrchestrator (Orchestrator)' failed with an error. Reason: Microsoft.Azure.WebJobs.Extensions.DurableTask.FunctionFailedException: The activity function 'PollForResponseAndDownload' failed: "RESPONSE_FILE_NOT_YET_AVAILABLE". See the function execution logs for additional details. ---> System.Exception: RESPONSE_FILE_NOT_YET_AVAILABLE at ...

Here are a few points from related discussions. Try rechecking your function app against them to resolve your issue.

  • When the CallActivityWithRetryAsync call is made, the DurableOrchestrationContext calls the ScheduleWithRetry method of the OrchestrationContext class inside the DurableTask framework.
  • There, the Invoke method on the RetryInterceptor class is called, and it loops up to the maximum number of retries. This class does not expose properties or methods to obtain the number of retries.
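To make the loop described above concrete, here is a minimal, self-contained sketch of that kind of retry loop. It is not the DurableTask source: RetrySketch, InvokeWithRetry and the delay callback are hypothetical names used only for illustration, and the real framework waits on durable timers rather than a plain delegate. The point it illustrates is that when the Handle predicate returns false, the loop stops after the first failure, which matches the behaviour in the logs.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for a retry loop; not the framework's code.
public static class RetrySketch
{
    public static async Task<T> InvokeWithRetry<T>(
        Func<Task<T>> action,
        int maxNumberOfAttempts,
        TimeSpan firstRetryInterval,
        Func<Exception, bool> handle,
        Func<TimeSpan, Task> delay)
    {
        if (maxNumberOfAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxNumberOfAttempts));
        }

        Exception lastException = null;
        for (int attempt = 1; attempt <= maxNumberOfAttempts; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception ex)
            {
                lastException = ex;
                // A predicate that rejects the exception ends the loop at once.
                if (!handle(ex))
                {
                    break;
                }
            }

            // Wait between attempts, but not after the final one.
            if (attempt < maxNumberOfAttempts)
            {
                await delay(firstRetryInterval);
            }
        }

        throw lastException;
    }
}
```

With a predicate such as ex => ex.Message == "RESPONSE_FILE_NOT_YET_AVAILABLE" and a failure whose message carries any extra prefix, this loop runs the action exactly once and then rethrows.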

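The exception passed to Handle does not carry the bare message: the activity function adds the prefix "Activity function 'SomeActivityFunc' failed: " to it, so the equality check never matches and no retry is scheduled. Either create a custom exception type to throw and check for that type, use .Contains, or compare against "Activity function 'SomeActivityFunc' failed: RESPONSE_FILE_NOT_YET_AVAILABLE" instead.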

Handle = ex => ex.Message == "Activity function 'SomeActivityFunc' failed: " + RETRY_ERROR_MESSAGE
Handle = ex => ex.Message.Contains(RETRY_ERROR_MESSAGE)
Handle = ex => ex is SomeCustomExceptionType
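If you would rather not depend on the exact wrapper text, a slightly more defensive variant of the .Contains option is sketched below. RetryPredicates and IsResponseNotYetAvailable are hypothetical names, and checking the InnerException chain is an assumption on my part, based on the log showing the original System.Exception nested inside the FunctionFailedException.

```csharp
using System;

// Hypothetical helper; the class and method names are illustrative only.
public static class RetryPredicates
{
    public const string RetryErrorMessage = "RESPONSE_FILE_NOT_YET_AVAILABLE";

    // Returns true if the exception, or any exception nested inside it,
    // has a message containing the marker text.
    public static bool IsResponseNotYetAvailable(Exception ex)
    {
        for (Exception current = ex; current != null; current = current.InnerException)
        {
            if (current.Message.IndexOf(RetryErrorMessage, StringComparison.Ordinal) >= 0)
            {
                return true;
            }
        }

        return false;
    }
}
```

It can then be assigned as a method group, for example Handle = RetryPredicates.IsResponseNotYetAvailable, in place of the lambda in the question.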
