
In domain driven design, why would you use an “aggregate” object and what key principles would you apply when designing an aggregate?

I am new to DDD, and am just starting to understand its basic concepts. Can someone please tell me more about aggregate objects in DDD? In particular, why would you use an “aggregate” object and what key principles would you apply when designing an aggregate?

Thanks,

Let's start from the beginning. A long time ago in a galaxy far, far away there were SQL databases with ACID transactions. What we are really interested in here are atomicity and consistency from the ACID acronym. For example, if you have two changes, a1 -> a2 and b1 -> b2, and you make them in such a transaction, then the changes must be atomic and you will have only 2 valid state options: (a1, b1) and (a2, b2). So if the a1 -> a2 change fails, then the whole transaction fails. This is called immediate consistency.
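To make the atomicity part concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `state` table and the `make_db`, `apply_changes` and `read_state` helpers are names I made up for illustration; the failure is simulated.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the initial state (a1, b1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE state (name TEXT PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO state VALUES (?, ?)", [("a", "a1"), ("b", "b1")])
    conn.commit()
    return conn


def apply_changes(conn, fail_on_b=False):
    """Apply a1 -> a2 and b1 -> b2 in one transaction."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute("UPDATE state SET value = 'a2' WHERE name = 'a'")
            if fail_on_b:
                raise RuntimeError("simulated failure of the b1 -> b2 change")
            conn.execute("UPDATE state SET value = 'b2' WHERE name = 'b'")
    except RuntimeError:
        pass  # the transaction has already been rolled back


def read_state(conn):
    """Return the current state as a dict, e.g. {'a': 'a1', 'b': 'b1'}."""
    return dict(conn.execute("SELECT name, value FROM state"))
```

If the second change fails, the first one is rolled back too, so a reader never sees (a2, b1).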

In contrast there are NoSQL databases, many of which are not ACID compliant. With these databases your changes are not atomic and you can observe multiple states: (a1, b1), (a2, b1), (a1, b2), (a2, b2), depending on the order in which the changes are made or which changes fail. This is called eventual consistency.

If you have a distributed system with complex changes which involve multiple computers, then you have two options. You can use eventual consistency, which will be very fast, but the data won't be consistent across the machines at every moment. Or you can use immediate consistency with two-phase commit, and the updates will be very slow, but the data will be consistent between the machines.

In DDD the aggregate is a consistency boundary: the inside changes with immediate consistency and the outside changes with eventual consistency. To stick with the same example, you want to change 2 things with your command: a1 -> a2 and b1 -> b2. If these things are in the same aggregate x, then you can change them in the same transaction: (x.a1, x.b1) -> (x.a2, x.b2), using immediate consistency. If these things are in different aggregates x and y, then you cannot change them in the same transaction, so you have to use eventual consistency: x.a1 -> x.a2 and y.b1 -> y.b2 will be two transactions which are committed independently of each other.
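The two-aggregate case can be sketched roughly like this. Everything here is hypothetical: the `Store` class, the `pending` event list and the method names are my own, and a simple in-memory queue stands in for whatever messaging a real system would use to let y catch up.

```python
from dataclasses import dataclass, field


@dataclass
class X:
    a: str = "a1"


@dataclass
class Y:
    b: str = "b1"


@dataclass
class Store:
    """Hypothetical in-memory store; each method call stands for one transaction."""
    x: X = field(default_factory=X)
    y: Y = field(default_factory=Y)
    pending: list = field(default_factory=list)  # events not yet handled

    def change_a(self, new_a):
        # Transaction 1: only aggregate x is modified and committed.
        self.x.a = new_a
        self.pending.append(("a_changed", new_a))

    def process_pending(self, new_b):
        # Transaction 2, run some time later: aggregate y catches up.
        while self.pending:
            self.pending.pop(0)
            self.y.b = new_b
```

Between the two calls the system is in the intermediate state (x.a2, y.b1), which is exactly what eventual consistency allows.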

In DDD there is a rule, according to Vernon's book: you should not change multiple aggregates in a single transaction. If you do so, it is a code smell, a sign that you should choose the consistency boundaries differently, or in other words design the aggregates differently.

So when designing aggregates you have to keep these consistency boundaries in mind. If you don't, it will cause concurrency issues. For example, you have properties a and b on your aggregate x. These properties are independent of each other, so both (x.a1, x.b2) and (x.a2, x.b1) are valid states. If John wants to change x.a1 -> x.a2 and Jane wants to change x.b1 -> x.b2 in two concurrent requests, then one of the requests will fail, despite the fact that both scenarios, (x.a1, x.b1) -> (x.a2, x.b1) -> (x.a2, x.b2) and (x.a1, x.b1) -> (x.a1, x.b2) -> (x.a2, x.b2), would result in the same state (x.a2, x.b2) and each of the steps is valid. So either John or Jane will have a bad day from working on the same aggregate simultaneously. Probably both of them, if they send multiple concurrent requests. You can resolve this problem by creating a new aggregate y and moving the b property into it. Then the changes will be x.a1 -> x.a2 and y.b1 -> y.b2 in two transactions, which won't cause trouble.
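Here is a minimal sketch of the John and Jane scenario. I am assuming optimistic locking with a version number per aggregate, which is one common way such a conflict gets detected; the `Aggregate`, `Repository` and `ConcurrencyError` names are hypothetical and the storage is just a dict.

```python
import copy
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when an aggregate was changed by someone else since it was loaded."""


@dataclass
class Aggregate:
    id: str
    props: dict = field(default_factory=dict)
    version: int = 0


class Repository:
    """Hypothetical in-memory repository; one save() stands for one transaction."""

    def __init__(self):
        self._stored = {}

    def add(self, aggregate):
        self._stored[aggregate.id] = copy.deepcopy(aggregate)

    def load(self, aggregate_id):
        # Return a copy so that each request works on its own snapshot.
        return copy.deepcopy(self._stored[aggregate_id])

    def save(self, aggregate):
        # Reject the write if the stored version moved on since load().
        if self._stored[aggregate.id].version != aggregate.version:
            raise ConcurrencyError(aggregate.id)
        aggregate.version += 1
        self._stored[aggregate.id] = copy.deepcopy(aggregate)


def one_aggregate():
    """John and Jane both edit x; returns True if Jane's save is rejected."""
    repo = Repository()
    repo.add(Aggregate("x", {"a": "a1", "b": "b1"}))
    john, jane = repo.load("x"), repo.load("x")
    john.props["a"] = "a2"
    jane.props["b"] = "b2"
    repo.save(john)
    try:
        repo.save(jane)
    except ConcurrencyError:
        return True
    return False


def two_aggregates():
    """b lives in its own aggregate y, so both saves succeed."""
    repo = Repository()
    repo.add(Aggregate("x", {"a": "a1"}))
    repo.add(Aggregate("y", {"b": "b1"}))
    john, jane = repo.load("x"), repo.load("y")
    john.props["a"] = "a2"
    jane.props["b"] = "b2"
    repo.save(john)
    repo.save(jane)
    return repo.load("x").props, repo.load("y").props
```

In `one_aggregate()` Jane's save is rejected even though her change does not touch a, because the version check is per aggregate. In `two_aggregates()` the same two edits go through without a conflict.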

The inverse example is when you have two aggregates x and y and the properties x.a and y.b cannot change independently, so the (x.a1, y.b2) and (x.a2, y.b1) states are invalid. This is a sign that you should merge these two aggregates into one and use immediate consistency instead of eventual consistency.
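A merged aggregate could look something like the sketch below. The `MergedAggregate` and `InvariantViolation` names and the `VALID_STATES` set are assumptions of mine; a real invariant would be a business rule rather than a lookup table.

```python
class InvariantViolation(Exception):
    """Raised when a change would leave the aggregate in an invalid state."""


# Assumed invariant for this sketch: a and b must always move together.
VALID_STATES = {("a1", "b1"), ("a2", "b2")}


class MergedAggregate:
    def __init__(self):
        self._a, self._b = "a1", "b1"

    @property
    def state(self):
        return (self._a, self._b)

    def change(self, new_a, new_b):
        # Check the invariant before mutating, so a rejected change
        # leaves the previous valid state untouched.
        if (new_a, new_b) not in VALID_STATES:
            raise InvariantViolation(f"({new_a}, {new_b}) is not a valid state")
        self._a, self._b = new_a, new_b
```

Because both properties now live behind one method of one aggregate, there is no moment at which another transaction can observe (a2, b1) or (a1, b2).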

There is a good chance that your system will run on multiple machines. The bigger components, like bounded contexts and aggregates, will be eventually consistent with each other, while the small ones, like value objects and entities, will be immediately consistent. Thus you can deploy your bounded contexts on multiple machines without distributed transactions and two-phase commit, which will result in a fast and reliable system. On the other hand, the aggregates can have only valid states thanks to the transactions you use with them.

Note that I am not an expert on the topic, I just read a book. =]

1y later:

I found a very good article about aggregates. According to it, you should put consistency boundaries around invariants to prevent contract violations. So if you have a list of invariants, you can use them to define consistency boundaries. Aggregate boundaries will be similar. Ideally they include every invariant, but if they grow too big, they will result in too many concurrency exceptions, so in complex cases they can't include some of the invariants.

An aggregate is a consistency boundary

It's a concept that allows you to specify which entities can be changed atomically, and which cannot.

This also makes aggregates the primary container for loading and persisting data through a repository, i.e. repositories handle whole aggregates, not single entities.

To find a suitable aggregate design, you need to fully understand your use cases and try to find out which entities need to be changed atomically in one transaction, and where you can resort to eventual consistency. Try to make aggregates small, but always consider the use cases first.

