
Azure Durable Framework Function App VERY Slow

I have made an app that uses the Azure Functions durable fan-out strategy to make parallel inquiries and updates to a database by sending HTTP requests to our own internal API.

I found out that the fan-out strategy is EXTREMELY slow compared to using the TPL library and doing the parallelism that way in a normal .NET Core web app. It's not just slower; it's about 20 times slower. It takes 10 minutes for 130 updates, while the .NET Core 3.1 app I made for speed comparison, which does the exact same thing, does 130 updates in 0.5 minutes, and on a significantly lower plan.

I understand there is latency because of the Durable Framework infrastructure (communicating with the storage account and so on), but I don't see how that speed difference is normal. Each individual update happens in an ActivityTrigger function, and the orchestrator is the one that gathers all the necessary updates and puts them in a Task.WhenAll() call, just like the example from the Microsoft docs.
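For reference, the fan-out/fan-in orchestrator described above typically looks something like this (a minimal sketch following the pattern in the Microsoft docs; the function names, `GetUpdateIds`, and `UpdateRecord` are placeholders, not the asker's actual code):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class UpdateOrchestration
{
    [FunctionName("UpdateOrchestrator")]
    public static async Task RunOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Gather the items to update (e.g. via an activity that calls the internal API).
        string[] ids = await context.CallActivityAsync<string[]>("GetUpdateIds", null);

        // Fan out: each CallActivityAsync enqueues a message to a storage queue,
        // and the result comes back to the orchestrator via another queue message.
        // That per-item queue round trip is where much of the latency lives.
        var tasks = new List<Task>();
        foreach (var id in ids)
        {
            tasks.Add(context.CallActivityAsync("UpdateRecord", id));
        }

        await Task.WhenAll(tasks); // fan in
    }
}
```

Note that this code is not the problem by itself; the queue-backed checkpointing behind each `CallActivityAsync` is what adds the overhead discussed in the answer below.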

Am I doing something wrong here? Is this business scenario maybe not compatible with this technology? The code seems to work fine and the parallelism works; it's just a LOT slower than the .NET Core app. Another thing to mention is that the moment the function app opens a second instance (either because it's on the Consumption plan and naturally opens a second instance to deal with heavy load, or because it's on an App Service plan and I manually add an instance), it gets even slower, although the CPU load somehow balances across the two instances. I suspect this could be extra latency due to Azure queue communication between the two instances, but I'm not entirely sure.

One last detail is that the app also has a TimerTrigger that does a simple select in a database every minute (nothing even remotely CPU intensive, but it might play a role in the performance).

I've tried the function app on a Premium plan, a Consumption plan, and an App Service plan, and it seems to top out at 130 updates in 10 minutes no matter how big the plan is.

Speaking generally, TPL will almost always be much faster than Durable Functions because all the coordination is done in-memory (assuming you don't completely exhaust system resources doing everything on one machine). So that part is often expected. Here are a few points worth knowing:

  • Each fan-out to an activity function involves a set of queue transactions: one message for calling the activity function and one message for handing the result back to the orchestrator. When there are multiple VMs involved, you also have to worry about queue polling delays.
  • By default, the per-instance concurrency for activity functions is limited to 10 on a single-core VM. If your activity functions don't require much memory or CPU, then you'll want to crank up this value to increase per-instance concurrency.
  • If you're using the Azure Functions Consumption or Premium plans, it will take 15-30 seconds before new instances get added for your app. This matters mainly if your workload can be done faster by running on multiple machines. The amount of time a message spends waiting on a queue is what drives scale-out (1 second is considered too long).
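The per-instance concurrency limit mentioned above is controlled in host.json. A sketch of the relevant section (Durable Functions 2.x schema; the value 50 is just an illustration, tune it to your workload's memory/CPU profile):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 50,
      "maxConcurrentOrchestratorFunctions": 10
    }
  }
}
```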

You can find more details on this in the Durable Functions Performance and Scale documentation.

One last thing I will say is that the key value-add of Durable Functions is orchestrating work in a reliable way in a distributed environment. However, if your workload isn't long-running, doesn't require strict durability/resilience, doesn't require scale-out to multiple VMs, and you have strict latency requirements, then Durable Functions might not be the right tool. If you just need a single VM and want low latency, then a simple function that uses in-memory TPL may be a better choice.
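The in-memory TPL alternative can be sketched as below: a helper you could call from a single (e.g. HTTP-triggered) function, throttling parallel updates with a `SemaphoreSlim` instead of durable activities. All names here are placeholders, and the degree of parallelism is an assumed value to tune:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelUpdater
{
    // Runs updateAsync for every id with bounded in-memory parallelism.
    // No queues or checkpoints are involved, so per-item overhead is minimal,
    // at the cost of losing Durable Functions' replay/resilience guarantees.
    public static async Task UpdateAllAsync(
        IEnumerable<string> ids,
        Func<string, Task> updateAsync,
        int maxParallelism = 20)
    {
        using var gate = new SemaphoreSlim(maxParallelism);
        var tasks = ids.Select(async id =>
        {
            await gate.WaitAsync();
            try { await updateAsync(id); }
            finally { gate.Release(); }
        });
        await Task.WhenAll(tasks); // all coordination stays in memory
    }
}
```

If the process crashes mid-run, unfinished updates are simply lost, which is exactly the trade-off against Durable Functions described above.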

