
Sum one table and update other with result, or just do `sum` on select?

2020-03-31 00:06:12 · Tags: mysql, sql, relational-database, denormalization

This might be a dumb question, but I'm creating a system in which users are not allowed to exceed X KB in total across the pictures they upload.

When a picture is uploaded, I update the `images` table with the size of the image in KB and other info.

Now, should I also keep track of the total size of each user's images on the `users` table? Or should I just do a `select sum(size) from images where user = xxx` every time I want to check the limit? That check might run with every new upload.
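For reference, the on-demand option from the question might look roughly like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere; the column name `user_id`, the constant `LIMIT_KB`, and the helper names are assumptions made for illustration, not part of the original schema.

```python
import sqlite3

# Hypothetical limit and schema; sqlite3 stands in for MySQL here.
LIMIT_KB = 1000

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " size INTEGER NOT NULL)"  # size in KB, as in the question
)
# An index on user_id keeps the per-user aggregate cheap.
conn.execute("CREATE INDEX idx_images_user ON images (user_id)")


def used_kb(conn, user_id):
    """Return the total KB stored for one user, computed on demand."""
    # COALESCE turns the NULL that SUM yields for zero rows into 0.
    row = conn.execute(
        "SELECT COALESCE(SUM(size), 0) FROM images WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0]


def try_upload(conn, user_id, size_kb):
    """Record the image only if the user stays within LIMIT_KB."""
    if used_kb(conn, user_id) + size_kb > LIMIT_KB:
        return False
    conn.execute(
        "INSERT INTO images (user_id, size) VALUES (?, ?)",
        (user_id, size_kb),
    )
    conn.commit()
    return True


first = try_upload(conn, 1, 600)   # fits under the limit
second = try_upload(conn, 1, 600)  # would bring the total to 1200 KB
print(first, second, used_kb(conn, 1))  # True False 600
```

Note that the check and the insert are two separate statements here, so with concurrent uploads they would need to run inside one transaction with suitable locking.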

What would be the best approach from a relational point of view?

2 Answers

You can use either method.

However, because you have a business rule related to the sum of the sizes, I might suggest that you use triggers to maintain the sum at the user level. Although this adds some overhead to `insert`s/`update`s/`delete`s, it has much less overhead when returning information about a user.
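A minimal sketch of the trigger idea follows, again using Python's `sqlite3` as a runnable stand-in. The trigger names and the `total_kb` and `user_id` columns are hypothetical, and MySQL's trigger syntax differs (for example, it needs `FOR EACH ROW`), so treat this as an illustration of the technique rather than ready-to-use MySQL DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    total_kb INTEGER NOT NULL DEFAULT 0   -- hypothetical running total
);
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    size INTEGER NOT NULL                 -- size in KB
);

-- Add the new image's size to its owner's total.
CREATE TRIGGER images_ai AFTER INSERT ON images BEGIN
    UPDATE users SET total_kb = total_kb + NEW.size WHERE id = NEW.user_id;
END;

-- On a change of size or owner, undo the old row and apply the new one.
CREATE TRIGGER images_au AFTER UPDATE OF size, user_id ON images BEGIN
    UPDATE users SET total_kb = total_kb - OLD.size WHERE id = OLD.user_id;
    UPDATE users SET total_kb = total_kb + NEW.size WHERE id = NEW.user_id;
END;

-- Subtract the size of a deleted image.
CREATE TRIGGER images_ad AFTER DELETE ON images BEGIN
    UPDATE users SET total_kb = total_kb - OLD.size WHERE id = OLD.user_id;
END;
""")

conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute("INSERT INTO images (user_id, size) VALUES (1, 300)")  # id 1
conn.execute("INSERT INTO images (user_id, size) VALUES (1, 200)")  # id 2
conn.execute("UPDATE images SET size = 250 WHERE id = 2")
conn.execute("DELETE FROM images WHERE id = 1")

total = conn.execute("SELECT total_kb FROM users WHERE id = 1").fetchone()[0]
print(total)  # 250
```

Reading a user's usage is then a single-row lookup on `users` instead of an aggregate over `images`.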

This has a few other advantages as well:

- You can impose business rules on the sizes. For instance, you can round the sizes up to the nearest 1k and then sum them. You wouldn't want such business logic spread through multiple queries.
- You can implement a check constraint directly in the `users` table (well, you can do this in the most recent versions of MySQL).
- You can index the total image size, so you easily see who is closest to their limit.
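The three points above can be combined in one small sketch, shown here with Python's `sqlite3` as a stand-in for MySQL. Everything named below (`limit_kb`, `total_kb`, `size_bytes`, the trigger and index names) is an assumption for illustration; in particular, the sketch stores sizes in bytes so that the rounding rule has something to round, whereas the question stores KB directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    total_kb INTEGER NOT NULL DEFAULT 0,
    limit_kb INTEGER NOT NULL DEFAULT 1000,
    CHECK (total_kb <= limit_kb)          -- the quota as a check constraint
);
-- Index on the stored total, for "who uses the most space" queries.
CREATE INDEX idx_users_total_kb ON users (total_kb);

CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    size_bytes INTEGER NOT NULL
);

-- The rounding rule lives in one place: round each image up to whole KB.
-- (SQLite integer division truncates; MySQL would need DIV or CEIL.)
CREATE TRIGGER images_ai AFTER INSERT ON images BEGIN
    UPDATE users
    SET total_kb = total_kb + (NEW.size_bytes + 1023) / 1024
    WHERE id = NEW.user_id;
END;
""")

conn.execute("INSERT INTO users (id, limit_kb) VALUES (1, 10)")
conn.execute("INSERT INTO images (user_id, size_bytes) VALUES (1, 1500)")  # counts as 2 KB

try:
    # Counts as 10 KB, which would push the total to 12 KB, over the 10 KB limit.
    conn.execute("INSERT INTO images (user_id, size_bytes) VALUES (1, ?)", (9 * 1024 + 1,))
    rejected = False
except sqlite3.IntegrityError:
    # The check constraint fails inside the trigger, so the insert is undone too.
    rejected = True

total = conn.execute("SELECT total_kb FROM users WHERE id = 1").fetchone()[0]
image_count = conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
heaviest = conn.execute("SELECT id FROM users ORDER BY total_kb DESC LIMIT 5").fetchall()
print(rejected, total, image_count)  # True 2 1
```

With this layout the database itself refuses an upload that would exceed the quota, so application code cannot forget the check.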

Storing the SUM in the `users` table is one type of denormalization.

This can be worthwhile if you need to query the sum frequently, and it's too slow to do the aggregate query every time you need it.

But you accept the risk that the stored sum in the `users` table will become out of sync with the real `SUM(size)` of the associated images.

You wouldn't think this would be difficult, but in practice, there are lots of edge cases where the stored sum fails to be updated. You will end up periodically running the aggregate query in the background, to overwrite the stored sum, just in case it has gotten out of sync.
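The background repair job described above might be sketched as follows, using Python's `sqlite3` in place of MySQL. The `total_kb` column and the function names `drifted_users` and `reconcile` are hypothetical; the sample data deliberately starts with a stale total for user 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, total_kb INTEGER NOT NULL DEFAULT 0);
CREATE TABLE images (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, size INTEGER NOT NULL);
-- User 1's stored total (500) is stale: the images actually sum to 550.
INSERT INTO users (id, total_kb) VALUES (1, 500), (2, 0);
INSERT INTO images (user_id, size) VALUES (1, 300), (1, 250);
""")


def drifted_users(conn):
    """List (user id, stored total, real total) where the two disagree."""
    return conn.execute("""
        SELECT u.id, u.total_kb, COALESCE(SUM(i.size), 0) AS real_kb
        FROM users u
        LEFT JOIN images i ON i.user_id = u.id
        GROUP BY u.id, u.total_kb
        HAVING u.total_kb <> COALESCE(SUM(i.size), 0)
    """).fetchall()


def reconcile(conn):
    """Overwrite every stored total with the freshly aggregated value."""
    conn.execute("""
        UPDATE users
        SET total_kb = (
            SELECT COALESCE(SUM(size), 0)
            FROM images
            WHERE images.user_id = users.id
        )
    """)
    conn.commit()


before = drifted_users(conn)
reconcile(conn)
after = drifted_users(conn)
print(before, after)  # [(1, 500, 550)] []
```

Logging the output of something like `drifted_users` before overwriting can help track down which code path failed to update the stored sum.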

Denormalization is more work for you as a coder, because you have to write code to correct for anomalies like that. Be conservative about how many cases of denormalization you create, because each one obligates you to do more work.

But if it's very important that your query for the sum returns the result faster than is possible by running the aggregate query, then that's what you have to do.

In my experience, all optimizations come with a price.