
Is there a pattern for buffering related messages in Azure Service Bus for batch processing?

I'm building a solution which processes changes to records based on messages placed in an Azure Service Bus queue. The queue contains an interleaved sequence of messages for changes to multiple different records, and an Azure Function is triggered per message to process the changes and store them in a back-end service. Each message on the change queue relates to a single record and contains a property, recordId.

The problem is, there's some overhead for calling the backend service with each message, and very often multiple messages appear on the queue for an individual record in quick succession. I'd like to allow multiple changes to 'buffer' for each record so that the backend service can be called less frequently, but with a batch of changes.

I'm looking at using the Sessions feature of Service Bus, with an Azure Function monitoring the change queue and forwarding each message to a session-enabled queue, using the recordId as the session ID. Ideally, I'd like messages to buffer in the session queue until either:

  1. There's a period of quiet for that recordId (e.g. no change for 10 seconds)
  2. Or, a limit is reached for the session size (e.g. 30 changes to the record)
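To make the two conditions concrete, here is a minimal in-memory sketch in Python. It is only a model of the desired behaviour, not a Service Bus or Functions API: the `ChangeBuffer` class and its method names are hypothetical, time is passed in explicitly so the logic is easy to follow, and state held in process memory like this would not survive a Function restart or scale-out.

```python
from typing import Dict, List, Optional


class ChangeBuffer:
    """Hypothetical in-memory model of per-record buffering with two flush triggers."""

    def __init__(self, quiet_seconds: float = 10.0, max_changes: int = 30) -> None:
        self.quiet_seconds = quiet_seconds
        self.max_changes = max_changes
        self._pending: Dict[str, List[str]] = {}
        self._last_seen: Dict[str, float] = {}

    def add(self, record_id: str, change: str, now: float) -> Optional[List[str]]:
        """Buffer one change; return the whole batch if the size limit is reached."""
        self._pending.setdefault(record_id, []).append(change)
        # Sliding timeout: every new change resets the quiet-period clock.
        self._last_seen[record_id] = now
        if len(self._pending[record_id]) >= self.max_changes:
            return self._take(record_id)
        return None

    def flush_quiet(self, now: float) -> Dict[str, List[str]]:
        """Return batches for records that had no change for quiet_seconds."""
        due = [rid for rid, seen in self._last_seen.items()
               if now - seen >= self.quiet_seconds]
        return {rid: self._take(rid) for rid in due}

    def _take(self, record_id: str) -> List[str]:
        # Remove and return the buffered changes for one record.
        self._last_seen.pop(record_id, None)
        return self._pending.pop(record_id)
```

Here `add` covers the size limit and `flush_quiet` covers the quiet period; something would still have to call `flush_quiet` periodically, which is exactly the part that is hard to do with Service Bus alone.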

The question is: How can I trigger processing of each session based on these scenarios? I've looked at using Scheduled Messages to handle the first scenario, but then I need a reliable way of re-scheduling to create a sliding timeout. Similarly, there doesn't seem to be a good way of monitoring the size of a session, apart from storing a counter somewhere.

I'm trying to solve the problem purely using Service Bus and Functions, although I'm very open to any other ideas.

There's a mismatch between what you want to achieve and the service's intent and capabilities. Let's have a look at sessions first.

Sessions are intended for processing messages in the order they were sent. They are Azure Service Bus's way of giving you a FIFO queue on top of the unordered queue you'd normally get, and they eliminate the chance of competing consumers handling related messages out of order. Session messages are "drained" from the queue; after the last one is consumed and a configured session timeout has elapsed, the consumer of the session moves on to the next available session.

Scheduled messages are messages sent for delivery at a future time in order to delay their processing. Scheduling is done on a per-message basis and cannot be applied to a group such as a session. You could certainly orchestrate scheduling so that the messages become available at roughly the same time, but it would be tricky: how do you know when to schedule the first message when you haven't received the last one yet?

Batched receive with a function is a way to request up to a given number of messages per invocation; it doesn't mean you'll get exactly that number. That said, the combination of sessions and batched receive could work if all of the messages for the session are already in the queue. While you might not process all the messages associated with a given session in a single function execution, you'd still be processing less frequently and in the order in which those messages were sent.
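As a rough illustration of what a batch handler could do with whatever it receives, the Python sketch below groups a batch by record so the backend can be called once per record instead of once per message. `ChangeMessage` is a hypothetical stand-in for a received message, not the SDK's message type, and `group_by_record` is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class ChangeMessage:
    """Hypothetical stand-in for a received message (not an SDK type)."""
    record_id: str  # the recordId property described in the question
    body: str       # the change payload


def group_by_record(batch: Iterable[ChangeMessage]) -> Dict[str, List[str]]:
    """Group a received batch by record, keeping the order within each record."""
    grouped: Dict[str, List[str]] = {}
    for message in batch:
        grouped.setdefault(message.record_id, []).append(message.body)
    return grouped
```

Each value in the returned dictionary can then be sent to the backend as one batched call. The batch may still hold only part of a record's changes, so this reduces the number of calls rather than guaranteeing one call per record.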

To sum up, you could reduce the number of function invocations by setting both IsBatched and IsSessionsEnabled to true, so that messages are delivered in batches and processed in the same order they were sent.

