
Rollback database changes in a durable function

Let's say I have the following orchestration:

[FunctionName("Orchestration")]
public static async Task Orchestration_Start([OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    await ctx.CallActivityAsync("Foo");
    await ctx.CallActivityAsync("Bar");
    await Task.WhenAll(ctx.CallActivityAsync("Baz"), ctx.CallActivityAsync("Baz"));
}

All my activities use an Azure SQL database, and if any of the calls fails, I want to be able to undo all the changes made by previous activities. For example, if the second call to Baz throws an exception, I want to undo everything done by Foo and Bar, and if the first Baz has completed, I want to undo its modifications too.

In a non-Functions application, I'd be able to just wrap the entire body of the orchestration in a using (var scope = new TransactionScope()) block.

Will this work for a potentially distributed orchestration, and if not, is there any analogous mechanism in the Azure Functions framework? Or am I required to write a rollback implementation for each of the activities and commit the changes to the database after completing each of them?

Durable Functions implement a mechanism of eventual consistency. This is quite a different concept from other kinds of consistency (e.g. strong consistency): it guarantees only that a transaction will be completed eventually. What does that mean?

With TransactionScope you can be sure that if anything goes wrong within a transaction, a rollback is performed automatically. This is not the case in Durable Functions: no built-in feature gives you that behavior. In fact, if the second activity from your example fails, the changes already made by the first activity stay in place, and you end up with inconsistent data in the database.

To implement a transaction in this scenario, you have to catch the failure with try/catch and run logic that compensates for the error:

[FunctionName("Orchestration")]
public static async Task Orchestration_Start([OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    try 
    {
        await ctx.CallActivityAsync("Foo");
        await ctx.CallActivityAsync("Bar");
        await Task.WhenAll(ctx.CallActivityAsync("Baz"), ctx.CallActivityAsync("Baz"));
    }
    catch(Exception)
    {
        // Do something...
    }  
}
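
As a rough illustration of what the catch block could do, the sketch below records an undo step each time an activity finishes and runs those steps in reverse order when something fails. The undo activity names (UndoFoo, UndoBar, UndoBaz), the integer inputs passed to Baz, and the callActivity delegate (a stand-in for ctx.CallActivityAsync so the logic can be exercised outside the Functions host) are all assumptions for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CompensatingOrchestration
{
    // callActivity is a hypothetical stand-in for ctx.CallActivityAsync(name, input).
    public static async Task RunAsync(Func<string, object, Task> callActivity)
    {
        // Undo activities for completed sequential steps, most recent on top.
        var undo = new Stack<string>();
        // Declared outside the try block so the catch block can inspect them.
        Task baz1 = null, baz2 = null;
        try
        {
            await callActivity("Foo", null);
            undo.Push("UndoFoo");
            await callActivity("Bar", null);
            undo.Push("UndoBar");

            baz1 = callActivity("Baz", 1);
            baz2 = callActivity("Baz", 2);
            await Task.WhenAll(baz1, baz2);
        }
        catch (Exception)
        {
            // Task.WhenAll finishes only after both tasks have finished,
            // so each Baz task has a final status by the time we get here.
            foreach (var (task, input) in new[] { (baz1, 1), (baz2, 2) })
            {
                if (task != null && task.Status == TaskStatus.RanToCompletion)
                    await callActivity("UndoBaz", input);
            }

            // Undo the sequential steps in reverse order.
            while (undo.Count > 0)
                await callActivity(undo.Pop(), null);

            throw; // let the orchestration still report the failure
        }
    }
}
```

Inside a real orchestrator this would presumably be called as CompensatingOrchestration.RunAsync((name, input) => ctx.CallActivityAsync(name, input)). Each undo activity still has to be written by hand, for example as a SQL statement that reverses what the matching activity did.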

You can also apply a retry policy to ride out transient errors:

public static async Task Run([OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    // Retry up to 3 attempts in total, waiting 5 seconds before the first retry.
    var retryOptions = new RetryOptions(
        firstRetryInterval: TimeSpan.FromSeconds(5),
        maxNumberOfAttempts: 3);

    await ctx.CallActivityWithRetryAsync("FlakyFunction", retryOptions, null);

    // ...
}

However, the important thing is to understand how the Durable Functions runtime actually handles the situation when something goes wrong. Let us assume that the following code fails:

[FunctionName("Orchestration")]
public static async Task Orchestration_Start([OrchestrationTrigger] DurableOrchestrationContext ctx)
{
    await ctx.CallActivityAsync("Foo");
    await ctx.CallActivityAsync("Bar"); // THROWS!
    await Task.WhenAll(ctx.CallActivityAsync("Baz"), ctx.CallActivityAsync("Baz"));
}

If the orchestration is replayed, the first activity (the one called with "Foo") will not be executed again: its result is kept in storage, so it is available immediately. The runtime performs a checkpoint after each activity, so the state is preserved and the runtime knows where it left off.
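
To make the checkpoint idea more concrete, here is a toy model of replay in plain C#. It is not the Durable Functions runtime and says nothing about how the real history is stored; ReplayContext, ReplayDemo and the string results are invented for illustration. The only point it shows is that re-running the orchestrator body against an existing history returns the recorded result instead of running the activity again.

```csharp
using System;
using System.Collections.Generic;

// Toy model of checkpoint-and-replay, NOT the Durable Functions runtime.
public sealed class ReplayContext
{
    private readonly List<string> history; // results checkpointed so far
    private int position;                  // next history entry to consume

    public ReplayContext(List<string> history) => this.history = history;

    public string CallActivity(Func<string> activity)
    {
        if (position < history.Count)
            return history[position++];    // replay: reuse the stored result

        string result = activity();        // not in history yet: really run it
        history.Add(result);               // checkpoint the result
        position++;
        return result;
    }
}

public static class ReplayDemo
{
    // Counts how many times the body of the "Foo" stand-in actually ran.
    public static int FooRuns;

    // Stand-in orchestrator body; it is re-run from the top on every replay.
    public static string Orchestrate(ReplayContext ctx) =>
        ctx.CallActivity(() => { FooRuns++; return "foo-done"; });
}
```

Calling ReplayDemo.Orchestrate twice with two ReplayContext instances that share the same history list leaves FooRuns at 1: the second call reads "foo-done" from the history rather than invoking the delegate.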

To handle the situation properly, you have to implement the following algorithm:

  • perform a manual rollback when an exception is caught
  • if that fails, push a message to e.g. a queue, to be handled manually by someone who understands how the process works
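
The two steps above could be sketched roughly as follows. Rollback and EnqueueForManualReview are hypothetical activity names (the second would be an activity that writes to whatever queue you choose), and callActivity is again an assumed stand-in for ctx.CallActivityAsync so that the control flow can be run outside the Functions host.

```csharp
using System;
using System.Threading.Tasks;

public static class RollbackWithEscalation
{
    // callActivity is a hypothetical stand-in for ctx.CallActivityAsync(name, input).
    public static async Task RunAsync(Func<string, object, Task> callActivity)
    {
        try
        {
            await callActivity("Foo", null);
            await callActivity("Bar", null);
        }
        catch (Exception workError)
        {
            try
            {
                // Step 1: manual rollback ("Rollback" is an assumed activity).
                await callActivity("Rollback", null);
            }
            catch (Exception rollbackError)
            {
                // Step 2: the rollback itself failed, so hand the case to a person.
                await callActivity(
                    "EnqueueForManualReview",
                    $"work failed: {workError.Message}; rollback failed: {rollbackError.Message}");
            }

            throw; // rethrow the original failure so the orchestration is marked as failed
        }
    }
}
```

Whether to rethrow at the end is a design choice; swallowing the exception instead would make the orchestration finish as completed even though the work was undone or escalated.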

While this may initially look like a big flaw, it is in fact a perfectly fine solution. Errors do occur, so it is always a good idea to guard against transient ones (using retry), but if the rollback fails, this clearly indicates that something is wrong in your system.

The choice is yours: either you have strong consistency and deal with the scalability problems, or you use a looser model that provides better scalability but is more difficult to work with.

