
Parallel Durable Azure functions

I have a new Durable Function that replaces a long-running WebJob. It works well and is faster than the former WebJob; however, I have an issue with parallelism.

I understand that all activities go onto a central work-item queue, which means items get processed in order. The issue is that if there are 10 items in the backlog from user A and user B then submits something, user B has to wait until all of user A's data has finished processing.

With the current WebJobs we can autoscale: a new WebJob will pick up the data for user B and process it in parallel with the existing processing.

Am I right in thinking the only way around this is to publish two copies of my function, one per user/client, to ensure one user isn't affected by another user's backlog of data?

I tried chunking work onto the work-item queue so that no single task put more than X items on it at once, so that in theory the resource would be shared. But that just slows things down: with fewer items on the work-item queue, the Consumption-plan autoscaling scales up very slowly due to the smaller volume.

UPDATE

I should have been clearer about why I see the issue. The approximate Durable Function process is as follows:

  • Split file into pages
  • Fan out, putting an activity on the queue for each page
  • Fan in
  • Fan out, putting another activity on the queue for each page (requires data from the previous fan-out to run)
  • Fan In
  • Insert information for pages into the DB in a single transaction
  • Mark the file as processed in the DB

So User A loads file 1 that has 1000 pages, then User B loads a file with 100 pages.

While I appreciate that the activity queue is processed in parallel, items are still pulled off the queue in order (I assume). So if there are 1,000 items on the queue for user A's file when user B's file starts, B's initial 100 page activities get put on the queue after the 1,000 and are hence "blocked" by them. Then, by the time those 100 initial page activities are done, there is a good chance the next fan-out for the 1,000-page document will have added more items to the queue, further blocking the progress of the 100-page document.
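The head-of-line blocking being described can be shown with a toy simulation (not Durable Functions code): even if many workers pull from the queue in parallel, items are dequeued in FIFO order, so user B's last page sits behind all of user A's backlog.

```python
from collections import deque

# Shared FIFO work-item queue: A enqueues 1000 pages, then B enqueues 100.
queue = deque([("A", i) for i in range(1000)] +
              [("B", i) for i in range(100)])

# Record the dequeue position of each user's final item. Parallel workers
# change how fast positions drain, but not the order they are reached in.
last_position = {}
for position, (user, _page) in enumerate(queue):
    last_position[user] = position
```

User A's last item sits at position 999, while user B's last item sits at position 1099 — roughly 1,000 of A's items must be pulled off before B's file can finish its first fan-out.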

My issue is that users A and B may be two different clients who would not expect their work to be blocked by another client's processing; hence my comment about having duplicate instances of the Durable Function and brokering messages between the multiple instances.

Does that make a bit more sense?

It's true that activities go onto a central work-item queue, but they do not get processed in order; they are actually processed in parallel. The only way things would be processed in order is if there is a single orchestrator function that intentionally sequences them (see function chaining).

If the work for user A and user B is done using different orchestration instances, or if it's a single instance that uses the fan-out/fan-in pattern, then you will get parallelization and don't have to worry about one user blocking another.

Also, just FYI, you can tweak the degree of concurrency using host.json. More details can be found here: https://docs.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-perf-and-scale#concurrency-throttles
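For example, a host.json like the following caps how many activities and orchestrators a single instance will run concurrently (the numbers here are illustrative, not recommendations; see the linked docs for the defaults):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 10,
      "maxConcurrentOrchestratorFunctions": 5
    }
  }
}
```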

UPDATE

It is true that the queue is shared, and large backlogs from one orchestration can cause delays in other orchestrations. In that case there are two possible solutions:

  1. Add more function app instances to process the backlog faster. This is done for you automatically in the Azure Functions Consumption plan, and is done continuously until the latency for this shared queue becomes sufficiently low.
  2. Create a separate function app with a second task hub for different priority jobs. Even if you use the same storage account, each task hub will have its own set of queues, so heavy load on one app will not impact the other.
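For option 2, each function app's host.json can name its own task hub, which keeps its queues separate even within the same storage account. The hub name below is hypothetical:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "ClientBTaskHub"
    }
  }
}
```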

I realize these are not perfect solutions because they don't necessarily ensure fairness. If fairness is a strict requirement, then new features may need to be added to support it. (BTW, feature requests can be made in the Durable Functions GitHub repo.)
