Can DDD/CQRS Aggregate Roots be microservices?

I'm currently designing a new system that I strongly believe CQRS+ES would be a perfect fit for. I want to verify that my "large scale" design assumptions sound reasonable and that I'm not heading in the wrong direction.

To me, it seems like a good idea to make each Aggregate Root (write model) live in its own microservice, because the consistency boundary and the network boundary then coincide.

I think it's safe to assume that each aggregate instance can have its own event stream, given the consistency guarantees. In practice, spinning up an event store for each aggregate instance would be overkill, but sharing one or a few replicated event stores between the microservices of the same aggregate type seems a sane choice to me. If replication is not desired, we can even shard the event stores by aggregate ID.
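To make the "one stream per aggregate instance, sharded by aggregate ID" idea concrete, here is a minimal in-memory sketch. Everything in it (`InMemoryEventStore`, `ShardedEventStore`, `shard_for`, the expected-version check) is a hypothetical illustration, not the API of any real event store; it assumes that per-stream ordering is enforced by an optimistic version check on append.

```python
import hashlib
from collections import defaultdict


class ConcurrencyError(Exception):
    """Raised when the caller's expected version does not match the stream."""


class InMemoryEventStore:
    """Hypothetical store: one ordered event list per aggregate instance."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, aggregate_id, expected_version, events):
        stream = self._streams[aggregate_id]
        # Optimistic concurrency: reject the write if someone else appended first.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"{aggregate_id}: expected {expected_version}, found {len(stream)}"
            )
        stream.extend(events)
        return len(stream)  # new stream version

    def read(self, aggregate_id):
        return list(self._streams[aggregate_id])


def shard_for(aggregate_id, shard_count):
    """Pick a shard from a stable hash of the aggregate ID (assumed scheme)."""
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedEventStore:
    """Routes every stream to exactly one shard, so no cross-shard ordering is needed."""

    def __init__(self, shard_count):
        self._shards = [InMemoryEventStore() for _ in range(shard_count)]

    def _shard(self, aggregate_id):
        return self._shards[shard_for(aggregate_id, len(self._shards))]

    def append(self, aggregate_id, expected_version, events):
        return self._shard(aggregate_id).append(aggregate_id, expected_version, events)

    def read(self, aggregate_id):
        return self._shard(aggregate_id).read(aggregate_id)
```

Because all events of one aggregate land on the same shard, ordering only has to hold inside a single stream. Changing the number of shards would remap streams under this naive modulo scheme, so the sketch assumes a fixed shard count.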

This way, with a few aggregate-type-scoped microservices, each handling commands for lots of aggregate instances, scaling should be transparent enough to the rest of the system.

Then it makes sense to me to let Projectors (read models) live in their own microservices as well, each with its own DB, which should be shared between projectors of the same type.

Because the projectors' query interface isn't very outer-world friendly (I assume that projectors provide a Repository-like interface for issuing queries, which in my opinion doesn't play well with access control, rate limiting etc.), each projector should provide some unified network interface to be used by a BFF (backend-for-frontend), which actually serves the API endpoints, ensures security, provides versioning and so on.

[Figure: hand-drawn diagram of the CQRS + microservices architecture]

tl;dr: I'm providing a graphical representation of the above (encircled) to compensate for my bad wording with my bad drawing. PS: The replay service is the service that monitors new events as they are appended to the event store and broadcasts them to the subscribed projectors or process managers (not drawn), OR replays the entire event sequence for projectors with stale or empty DBs.
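A rough sketch of what such a replay service could look like, under the assumption that there is a single append-only log and that each projector remembers a checkpoint (the position of the next event it needs). `ReplayService`, `CountingProjector` and `checkpoint` are made-up names for illustration. The sketch is single-threaded and ignores the race between catch-up and live delivery that a real service would have to handle.

```python
class ReplayService:
    """Hypothetical replay service over a global, append-only event log."""

    def __init__(self):
        self._log = []          # all events in append order
        self._subscribers = []  # projectors receiving live events

    def subscribe(self, projector):
        # Catch up first: replay everything from the projector's checkpoint on.
        # An empty DB means checkpoint 0, i.e. a full replay.
        for position in range(projector.checkpoint, len(self._log)):
            projector.handle(position, self._log[position])
        self._subscribers.append(projector)

    def publish(self, event):
        # Append to the log, then broadcast to every live subscriber.
        position = len(self._log)
        self._log.append(event)
        for projector in self._subscribers:
            projector.handle(position, event)


class CountingProjector:
    """Toy projector that records the events it has applied."""

    def __init__(self, checkpoint=0):
        self.checkpoint = checkpoint  # position of the next event to process
        self.seen = []

    def handle(self, position, event):
        if position < self.checkpoint:
            return  # already applied, ignore the redelivery
        self.seen.append(event)
        self.checkpoint = position + 1
```

A projector with an empty DB subscribes with `checkpoint=0` and gets the whole history; a stale one subscribes with its last stored checkpoint and only receives what it missed before switching to live events.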

Does this CQRS+microservices adaptation sound good, or have I fundamentally misunderstood something and the whole design is trash?

UPD1:

Why do I have a load balancer between the event source and the projectors, and how do I balance the load?

If I'm spawning multiple projector instances of the same type to handle additional load from heavy queries, how are they going to listen for events? To me it seems strange to allocate just one instance to do all the event processing, updating the DB and so on, because as the load increases it's likely to get overloaded. So, it makes sense to distribute event handling as well, right?

Also, while writing this, I've thought about whether it would be a good idea to further split projectors into "projector writers" (those listening for events and updating the shared state in the DB) and "projector readers" (those listening for queries and returning state), with the DB serving as the source of truth and consolidation point. This way we can scale better for asymmetric loads (few events, lots of queries) at no extra cost.

One of the necessary conditions is to prevent different projector writer instances from simultaneously handling events from the same aggregate, because updating the representation with out-of-order events will lead to the loss of internal consistency and immediate disaster.

As for "how", I can think of a few solutions:

  1. Keep a single RabbitMQ queue for all incoming events, and make all projector writers consume events from the queue with acknowledgements. Once the DB has been updated, the projector writer issues an ack to RabbitMQ and the event is discarded from the queue. Otherwise, if the projector writer dies for some reason, the event is requeued and delivered to the next projector writer in line.

    For each aggregate we should keep the height/revision number, and only allow the UPDATE to succeed if the revision number contained in the event is exactly one more than the current revision number. If this condition doesn't hold, requeue the event in the hope that the inconsistency will have been resolved by the time it comes around again, and grab the next one.

    Eventually it will complete, and given a large enough number of aggregates it will almost never need to requeue.

  2. Put some sort of dispatcher service in place to listen for events, one dispatcher per projector type. This dispatcher should distribute events to projector writers based on the hash of the aggregate ID, so that the same aggregate is always handled by the same projector writer, the one with index hash(event.aggregateId) % numberOfProjectorWriters .

    This will never requeue, but it terminates the MQ early, introduces a single point of failure, and will screw up if the number of projector writers changes because some nodes die, or because of dynamic scaling, or...

  3. Somehow use header exchanges to implement the combo of #1 and #2, to make consumers prefer the same set of aggregate IDs, but without screwing up if number of consumers changes in the middle.
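The revision check from option 1 and the hash routing from option 2 can be sketched in a few lines. This is only an illustration under stated assumptions: `ReadModel`, `writer_index` and the event dictionaries are hypothetical, revisions are assumed to start at 1, and the "duplicate" outcome (acking an event that was already applied) is an addition that is not in the list above, included because a queue with acknowledgements can redeliver.

```python
import zlib


class ReadModel:
    """Hypothetical projection table: aggregate_id -> (revision, state)."""

    def __init__(self):
        self.rows = {}

    def apply(self, event):
        """Apply the event only if it is the next revision; return the outcome."""
        revision, state = self.rows.get(event["aggregate_id"], (0, []))
        if event["revision"] <= revision:
            return "duplicate"  # already applied, safe to ack and drop
        if event["revision"] != revision + 1:
            return "requeue"    # gap: an earlier event has not arrived yet
        self.rows[event["aggregate_id"]] = (
            event["revision"],
            state + [event["type"]],
        )
        return "applied"


def writer_index(aggregate_id, writer_count):
    """Option 2 routing: stable hash of the aggregate ID modulo writer count."""
    return zlib.crc32(aggregate_id.encode("utf-8")) % writer_count


if __name__ == "__main__":
    model = ReadModel()
    first = {"aggregate_id": "a", "revision": 1, "type": "Created"}
    second = {"aggregate_id": "a", "revision": 2, "type": "Renamed"}
    print(model.apply(second))  # out of order
    print(model.apply(first))
    print(model.apply(second))

    # How many aggregates change writer when a fifth writer joins?
    ids = [f"agg-{i}" for i in range(100)]
    moved = sum(1 for i in ids if writer_index(i, 4) != writer_index(i, 5))
    print(f"{moved} of {len(ids)} aggregates would be routed to a different writer")
```

The last part shows the weakness mentioned for option 2: with plain modulo routing, a change in the number of writers reassigns a share of the aggregates, so in-flight events for those aggregates may end up on two writers unless the switch is coordinated.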

I believe that, although aggregate roots can technically live in their own microservices, that level of granularity may bring unneeded complexity. The usual advice is to start with a well-architected monolith.

Usually, if there are a few aggregates in a BC (bounded context), they can share some services and repositories, so keeping such aggregates together gives a sensible reduction in complexity and makes for a cohesive component. But it may depend on scaling needs.

By the way, which event store consistency guarantees do you mean? The most useful guarantee for an event stream is the ordering of events, which is really hard to achieve in a distributed environment. It would be great if you could provide a link to such guarantees for the candidate event stores.

I agree with you that read models may be located in separate microservices and scaled separately. Actually, read model managers can be separate reactive microservices. But the read models that these managers produce are plain resources, which can be stored in a single read-optimized, scaled and sharded resource store with a simplified REST API, where read-model managers just PUT prepared resources, tagging them with a timeout and other metadata.

More than that, aggregates may have a REST API as well. Commands and events can be resources that such an API serves. You can POST a command to an aggregate, get back a URL to the command's result, GET an event of an aggregate, or GET the consolidated event stream for the aggregate class...
