
How to properly organize related tables in a MySQL database?

Original post: 2021-07-20 13:46:47. Tags: mysql, join, sum, relational-database

There are two tables, users and orders:

id	first_name	orders_amount_total
1	Jone	5634200
2	Mike	3982830

id	user_id	order_amount
1	1	200
2	1	150
3	2	70
4	1	320
5	2	20
6	2	10
7	2	85
8	1	25

The tables are linked by user id. The task is to show, for each user, the sum of all their orders. A single user may have thousands or even tens of thousands of orders, and hundreds or thousands of users may be making requests at the same time. There are two options:

1. With each new order, in addition to writing to the orders table, increase the orders_amount_total counter, and then simply show that counter to the user.
2. Remove the orders_amount_total field, JOIN the two tables, and use SUM to calculate the total of all orders for a particular user.

Which option is better to use? Why? Why is the other option bad?

P.S. I believe the second option is concise and correct, given that the database is relational, but I have strong doubts about the load on the server: the set of rows to sum is large even for one user, and there are many users.
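For reference, the two options could be sketched roughly as follows. This is only an illustration, not the asker's actual code: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, the column types are assumptions, and the helper names add_order and totals_by_join are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        orders_amount_total INTEGER NOT NULL DEFAULT 0  -- only needed for option 1
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        order_amount INTEGER NOT NULL
    );
    INSERT INTO users (id, first_name) VALUES (1, 'Jone'), (2, 'Mike');
""")


def add_order(user_id, amount):
    """Option 1: insert the order and bump the stored total in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO orders (user_id, order_amount) VALUES (?, ?)",
            (user_id, amount),
        )
        conn.execute(
            "UPDATE users SET orders_amount_total = orders_amount_total + ? "
            "WHERE id = ?",
            (amount, user_id),
        )


def totals_by_join():
    """Option 2: compute every user's total on demand with JOIN and SUM."""
    return conn.execute("""
        SELECT u.id, u.first_name, COALESCE(SUM(o.order_amount), 0) AS total
        FROM users AS u
        LEFT JOIN orders AS o ON o.user_id = u.id
        GROUP BY u.id, u.first_name
        ORDER BY u.id
    """).fetchall()


# The eight sample orders from the question.
for user_id, amount in [(1, 200), (1, 150), (2, 70), (1, 320),
                        (2, 20), (2, 10), (2, 85), (1, 25)]:
    add_order(user_id, amount)

stored = conn.execute(
    "SELECT id, first_name, orders_amount_total FROM users ORDER BY id"
).fetchall()
print(stored)            # option 1: read the stored counter
print(totals_by_join())  # option 2: aggregate at query time
```

As long as every write goes through add_order, both reads return the same totals. The difference is where the work happens: option 1 pays on every write, option 2 pays on every read.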

2 answers

Option 2 is the correct one for the vast majority of cases.

Option 1 introduces data redundancy, which may lead to inconsistencies. With option 2 you're on the safe side: the sum is always computed from the orders themselves, so you always get the right value.
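A minimal sketch of how such an inconsistency can creep in, and how it could be detected. It again uses Python's sqlite3 as a stand-in for MySQL. The stored totals (695 and 185) are assumed here so that they start out matching the eight sample orders, and the "forgetful" DELETE is a made-up scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        orders_amount_total INTEGER
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        order_amount INTEGER
    );
    -- Stored totals start out consistent with the orders below (assumed values).
    INSERT INTO users VALUES (1, 'Jone', 695), (2, 'Mike', 185);
    INSERT INTO orders (user_id, order_amount) VALUES
        (1, 200), (1, 150), (2, 70), (1, 320), (2, 20), (2, 10), (2, 85), (1, 25);
""")

# A code path that forgets the counter: the order row is removed,
# but users.orders_amount_total is not adjusted.
conn.execute("DELETE FROM orders WHERE user_id = 2 AND order_amount = 85")

# Report users whose stored total disagrees with the real sum of their orders.
drifted = conn.execute("""
    SELECT u.id,
           u.orders_amount_total AS stored,
           COALESCE(SUM(o.order_amount), 0) AS actual
    FROM users AS u
    LEFT JOIN orders AS o ON o.user_id = u.id
    GROUP BY u.id, u.orders_amount_total
    HAVING u.orders_amount_total <> COALESCE(SUM(o.order_amount), 0)
""").fetchall()
print(drifted)  # [(2, 185, 100)]
```

With option 2 there is no stored value that can drift, so a check like this is unnecessary.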

Yes, denormalizing tables can improve performance, but that's a last resort and great care needs to be taken. "Tens of thousands" of rows isn't a particularly large set for an RDBMS; they are built to handle millions of rows and more quite well. So you seem to be far from needing the last resort and should go with option 2 and proper indexes.
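What "proper indexes" might look like here, as a rough sketch: an index on orders.user_id lets the per-user SUM read only that user's rows instead of scanning the whole table. The example uses Python's sqlite3 as a stand-in for MySQL, and the index name idx_orders_user_id and the helper total_for are hypothetical. In MySQL the corresponding statements would be CREATE INDEX and EXPLAIN, whose output looks different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        order_amount INTEGER NOT NULL
    );
    -- Hypothetical index name; supports lookups by user_id.
    CREATE INDEX idx_orders_user_id ON orders (user_id);
""")
conn.executemany(
    "INSERT INTO orders (user_id, order_amount) VALUES (?, ?)",
    [(1, 200), (1, 150), (2, 70), (1, 320), (2, 20), (2, 10), (2, 85), (1, 25)],
)

SUM_FOR_USER = "SELECT COALESCE(SUM(order_amount), 0) FROM orders WHERE user_id = ?"


def total_for(user_id):
    """Return the sum of all orders for one user (0 if the user has none)."""
    return conn.execute(SUM_FOR_USER, (user_id,)).fetchone()[0]


# Ask the engine how it intends to run the query; the last column of each
# row is a human-readable description whose wording varies by version.
plan = conn.execute("EXPLAIN QUERY PLAN " + SUM_FOR_USER, (1,)).fetchall()
for row in plan:
    print(row[-1])

print(total_for(1), total_for(2))
```

If the plan mentions the index rather than a full scan of orders, the query cost grows with the number of orders of that one user, not with the size of the whole table.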

I agree with @sticky_bit that Option 2 is better than Option 1. There's another possibility:

Create a VIEW that's a predefined invocation of the JOIN / SUM query. A smart DBMS should be able to infer that each time the orders table is updated, it also needs to adjust orders_amount_total for the affected user_id.
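A sketch of such a view, once more with Python's sqlite3 standing in for MySQL. The view name user_order_totals and the helper view_rows are hypothetical. Note that in MySQL a plain view stores only the query text and is evaluated when it is read, so it always reflects the current orders but does not by itself avoid the cost of the aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        order_amount INTEGER NOT NULL
    );
    INSERT INTO users VALUES (1, 'Jone'), (2, 'Mike');
    INSERT INTO orders (user_id, order_amount) VALUES (1, 200), (1, 150), (2, 70);

    -- Hypothetical view name; wraps the JOIN / SUM query from option 2.
    CREATE VIEW user_order_totals AS
    SELECT u.id AS user_id,
           u.first_name,
           COALESCE(SUM(o.order_amount), 0) AS orders_amount_total
    FROM users AS u
    LEFT JOIN orders AS o ON o.user_id = u.id
    GROUP BY u.id, u.first_name;
""")


def view_rows():
    """Read the totals through the view, ordered by user."""
    return conn.execute(
        "SELECT user_id, first_name, orders_amount_total "
        "FROM user_order_totals ORDER BY user_id"
    ).fetchall()


before = view_rows()
conn.execute("INSERT INTO orders (user_id, order_amount) VALUES (2, 20)")
after = view_rows()
print(before)  # [(1, 'Jone', 350), (2, 'Mike', 70)]
print(after)   # [(1, 'Jone', 350), (2, 'Mike', 90)]
```

Callers can then select from the view as if orders_amount_total were a real column, without any counter to maintain.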

BTW, regarding your schema design: don't name columns id, and don't use the same column name in two different tables unless they mean the same thing.