Azure Durable Functions' CallActivityWithRetryAsync does not retry on failure

In my orchestrator function, I upload a request file to an external server. After a non-deterministic amount of time, a response file should have been generated. I need to poll for this file and download it.

My current approach is to wait 10 minutes after the upload and then use the built-in CallActivityWithRetryAsync with RetryOptions. After the first poll/download failure, wait 5 minutes before retrying, up to a total of 10 attempts. A retry should only be attempted if an exception with the message RESPONSE_FILE_NOT_YET_AVAILABLE is thrown within the activity function.

        var nowIn10Minutes = ctx.CurrentUtcDateTime.AddMinutes(10);
        await ctx.CreateTimer(nowIn10Minutes, CancellationToken.None);

        const string RETRY_ERROR_MESSAGE = "RESPONSE_FILE_NOT_YET_AVAILABLE";
        var retryOptions = new RetryOptions(TimeSpan.FromMinutes(5), 10)
        {
            Handle = ex => ex.Message == RETRY_ERROR_MESSAGE
        };
        await ctx.CallActivityWithRetryAsync(nameof(PollForResponseAndDownload), retryOptions, input);

However, according to the logs, this retry logic is not honoured. See below.

After waiting 10 minutes as set in the timer, the orchestration immediately fails with a FunctionFailedException. No retry is executed, even though the correct exception message is shown in the logs.

Am I fundamentally misunderstanding the process? Here are the relevant logs:

-> After uploading the request, wait 10 minutes

2022-01-31 00:00:06.740 <GUID>: Function 'MyOrchestrator (Orchestrator)' is waiting for input. Reason: CreateTimer:2022-01-31T00:10:06.5093237Z. IsReplay: False. State: Listening. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 112.
2022-01-31 00:00:06.741 <GUID>: Function 'MyOrchestrator (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 113.

-> Resume after 10 minutes, schedule the activity function for execution

2022-01-31 00:10:32.700 <GUID>: Function 'MyOrchestrator (Orchestrator)' was resumed by a timer scheduled for '2022-01-31T00:10:06.5093237Z'. IsReplay: False. State: TimerExpired. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 114.
2022-01-31 00:10:32.701 <GUID>: Function 'PollForResponseAndDownload (Activity)' scheduled. Reason: MyOrchestrator. IsReplay: False. State: Scheduled. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 115.
2022-01-31 00:10:32.701 <GUID>: Function 'MyOrchestrator (Orchestrator)' awaited. IsReplay: False. State: Awaited. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 116.

-> Start the activity function. It fails immediately with the expected ex.Message, but the retry logic still does not run.

2022-01-31 00:10:32.715 <GUID>: Function 'PollForResponseAndDownload (Activity)' started. IsReplay: False. Input: (368 bytes). State: Started. HubName: <HUB-NAME>. AppName: <APP-NAME>. SlotName: Production. ExtensionVersion: 2.6.0. SequenceNumber: 117. TaskEventId: 5
2022-01-31 00:10:37.078 <GUID>: Function 'PollForResponseAndDownload (Activity)' failed with an error. Reason: System.Exception: RESPONSE_FILE_NOT_YET_AVAILABLE at MyNamespace.func._getResponseFileContents(String fileHeader) in C:\Users\me\source\AppName\func.cs:line ...
2022-01-31 00:10:37.364 <GUID>: Function 'MyOrchestrator (Orchestrator)' failed with an error. Reason: Microsoft.Azure.WebJobs.Extensions.DurableTask.FunctionFailedException: The activity function 'PollForResponseAndDownload' failed: "RESPONSE_FILE_NOT_YET_AVAILABLE". See the function execution logs for additional details. ---> System.Exception: RESPONSE_FILE_NOT_YET_AVAILABLE at ...

Here are a few links with related discussions. Can you try rechecking your function app according to this link to resolve your issue?

  • When the CallActivityWithRetryAsync call is made, the DurableOrchestrationContext calls the ScheduleWithRetry method of the OrchestrationContext class inside the DurableTask framework.
  • There, the Invoke method on the RetryInterceptor class is called, and it loops up to the maximum number of retries. This class does not expose properties or methods to obtain the number of retries.
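To make the role of the Handle predicate in such a loop concrete, here is a minimal sketch of a retry loop. It is a hypothetical stand-in, not the DurableTask source: RetrySketch, InvokeWithRetry and their parameters are names invented for illustration, and the real interceptor uses durable timers rather than Task.Delay. The point it shows is that a predicate returning false stops the loop after the first failure.

```csharp
using System;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Hypothetical simplified retry loop (assumption: not the framework's actual code).
    public static async Task<T> InvokeWithRetry<T>(
        Func<Task<T>> action,
        int maxAttempts,
        TimeSpan delay,
        Func<Exception, bool> handle)
    {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception ex)
            {
                last = ex;
                // If the predicate rejects the exception, no further attempts are made.
                if (!handle(ex))
                {
                    break;
                }
            }

            // Wait between attempts, but not after the final one.
            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }

        // Surface the last failure (or complain if no attempt was allowed at all).
        throw last ?? new ArgumentOutOfRangeException(nameof(maxAttempts));
    }
}
```

With a predicate such as ex => ex.Message == "RESPONSE_FILE_NOT_YET_AVAILABLE" and an exception whose message has been wrapped in extra text, this loop runs the action exactly once, which matches the behaviour seen in the logs.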

The activity function adds "Activity function 'SomeActivityFunc' failed: " to the message, so the exception passed to Handle no longer has a message exactly equal to RESPONSE_FILE_NOT_YET_AVAILABLE and the predicate returns false. Either throw a custom exception type and check for that type, use .Contains, or check for "Activity function 'SomeActivityFunc' failed: RESPONSE_FILE_NOT_YET_AVAILABLE" instead. The exact wrapper text may vary (the orchestrator log above shows a slightly different wording with a trailing sentence), so .Contains is less brittle than an exact match.

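// Option 1: match the full wrapped message (brittle if the wrapper text changes)
Handle = ex => ex.Message == "Activity function 'SomeActivityFunc' failed: " + RETRY_ERROR_MESSAGE
// Option 2: match on a substring of the message
Handle = ex => ex.Message.Contains(RETRY_ERROR_MESSAGE)
// Option 3: match on a custom exception type
Handle = ex => ex is SomeCustomExceptionType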
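The difference between these predicates can be checked without a function app. The sketch below simulates the wrapping with a plain System.Exception, using the message format from the orchestrator log in the question. ResponseNotReadyException, HandleDemo and Wrap are hypothetical names for illustration only, and the InnerException check in TypeMatch is an assumption: whether the original exception type survives as InnerException depends on how the extension serializes it.

```csharp
using System;

// Hypothetical custom exception type thrown by the activity function.
public class ResponseNotReadyException : Exception
{
    public ResponseNotReadyException(string message) : base(message) { }
}

public static class HandleDemo
{
    public const string RetryErrorMessage = "RESPONSE_FILE_NOT_YET_AVAILABLE";

    // Exact match on the raw message: false once the exception has been wrapped.
    public static bool ExactMatch(Exception ex) => ex.Message == RetryErrorMessage;

    // Substring match: tolerant of any prefix or suffix the wrapper adds.
    public static bool ContainsMatch(Exception ex) => ex.Message.Contains(RetryErrorMessage);

    // Type match that also inspects InnerException (assumption: the original may be wrapped).
    public static bool TypeMatch(Exception ex) =>
        ex is ResponseNotReadyException || ex.InnerException is ResponseNotReadyException;

    // Simulated wrapper, using the message format shown in the orchestrator log.
    public static Exception Wrap(Exception inner) =>
        new Exception(
            "The activity function 'PollForResponseAndDownload' failed: \"" + inner.Message +
            "\". See the function execution logs for additional details.",
            inner);
}
```

Against HandleDemo.Wrap(new ResponseNotReadyException(HandleDemo.RetryErrorMessage)), ExactMatch returns false while ContainsMatch and TypeMatch return true.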
