Guidance on precomputed SQL attributes
Often I deal with aggregate or parent entities which have attributes derived from their constituent or child members. For example:
The byte_count and packet_count of a TcpConnection object are computed from the same attributes of its two constituent TcpStream objects, which in turn are computed from their constituent TcpPacket objects.
An Invoices object might have a total which is basically the SUM() of its constituent InvoiceLineItems' prices, with a little freight, discount and tax logic thrown in.
When dealing with millions of packets or millions of invoiced line items (I wish!), on-demand computation of these derived attributes -- either in a VIEW or more commonly in presentation logic like reports or web interfaces -- is often unacceptably slow.
How do you decide, before performance concerns force your hand, whether to "promote" derived attributes to precomputed fields?
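For reference, the on-demand approach the question describes can be sketched as a VIEW over the line items. This is only a minimal illustration using Python's sqlite3 module; the table and column names (invoices, invoice_line_items, freight_cents, price_cents) are assumptions, and the discount and tax logic is left out.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
view_conn = sqlite3.connect(":memory:")
view_conn.executescript("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        freight_cents INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE invoice_line_items (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoices(id),
        price_cents INTEGER NOT NULL
    );
    -- Derived attribute computed on demand: no total is stored anywhere.
    CREATE VIEW invoice_totals AS
    SELECT i.id AS invoice_id,
           i.freight_cents + COALESCE(SUM(li.price_cents), 0) AS total_cents
    FROM invoices AS i
    LEFT JOIN invoice_line_items AS li ON li.invoice_id = i.id
    GROUP BY i.id;
""")

view_conn.execute("INSERT INTO invoices (id, freight_cents) VALUES (1, 500)")
view_conn.executemany(
    "INSERT INTO invoice_line_items (invoice_id, price_cents) VALUES (?, ?)",
    [(1, 1000), (1, 2500)],
)

# Every read re-aggregates the line items for the requested invoice.
view_total = view_conn.execute(
    "SELECT total_cents FROM invoice_totals WHERE invoice_id = 1"
).fetchone()[0]
print(view_total)
```

The total can never drift out of sync with the line items here, which is what is given up once the value is promoted to a stored field.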
I personally wouldn't denormalize until performance trade-offs force my hand (because the downsides of denormalization are too drastic, IMHO), but you might also consider:
Ref: The Database Programmer: The Argument for Denormalization. Be sure to also read his article on Keeping Denormalized Values Correct; his recommendation is to use triggers. That brings home the kind of trade-off denormalization requires.
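To make that trade-off concrete, here is a rough sketch of a trigger-maintained total, written with Python's sqlite3 module. It is not taken from the referenced articles; the schema and trigger names are assumptions, and a real invoice would also need the freight, discount and tax logic.

```python
import sqlite3

# Hypothetical schema and trigger names, chosen only for illustration.
trig_conn = sqlite3.connect(":memory:")
trig_conn.executescript("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        total_cents INTEGER NOT NULL DEFAULT 0  -- precomputed (denormalized)
    );
    CREATE TABLE invoice_line_items (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoices(id),
        price_cents INTEGER NOT NULL
    );
    -- One trigger per write path; missing any of them lets the total drift.
    CREATE TRIGGER line_item_insert AFTER INSERT ON invoice_line_items
    BEGIN
        UPDATE invoices SET total_cents = total_cents + NEW.price_cents
        WHERE id = NEW.invoice_id;
    END;
    CREATE TRIGGER line_item_delete AFTER DELETE ON invoice_line_items
    BEGIN
        UPDATE invoices SET total_cents = total_cents - OLD.price_cents
        WHERE id = OLD.invoice_id;
    END;
    CREATE TRIGGER line_item_update AFTER UPDATE ON invoice_line_items
    BEGIN
        UPDATE invoices SET total_cents = total_cents - OLD.price_cents
        WHERE id = OLD.invoice_id;
        UPDATE invoices SET total_cents = total_cents + NEW.price_cents
        WHERE id = NEW.invoice_id;
    END;
""")

trig_conn.execute("INSERT INTO invoices (id) VALUES (1)")
trig_conn.executemany(
    "INSERT INTO invoice_line_items (id, invoice_id, price_cents) VALUES (?, ?, ?)",
    [(1, 1, 1000), (2, 1, 2500)],
)
trig_conn.execute("UPDATE invoice_line_items SET price_cents = 1200 WHERE id = 1")
trig_conn.execute("DELETE FROM invoice_line_items WHERE id = 2")

# The stored total should match a fresh aggregation of the remaining items.
stored_total = trig_conn.execute(
    "SELECT total_cents FROM invoices WHERE id = 1"
).fetchone()[0]
recomputed_total = trig_conn.execute(
    "SELECT COALESCE(SUM(price_cents), 0) FROM invoice_line_items WHERE invoice_id = 1"
).fetchone()[0]
print(stored_total, recomputed_total)
```

Three triggers for a single SUM() gives a sense of the maintenance cost: every way a line item can change needs its own piece of bookkeeping.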
Basically, you don't. You let performance concerns force your hand.
That's the best answer because 99% of the time you should not be pre-optimizing like this; it's better to just calculate it on the fly.
However, it is quite common for client-application developers to come to the server side with mistaken preconceptions like "on-demand computation of ... derived attributes ... -- is often unacceptably slow", and this just IS NOT true. The correct wording here would be "is rarely unacceptably slow".
As such, unless you are an expert in this (a DB Development Architect, etc.), you should not be engaging in premature optimization. Wait until it's obvious that it has to be fixed, then look at pre-aggregation.
How current the data must be determines how you implement it, really.
I'll assume 2 simple states: current or not current.
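As one possible reading of the "not current" state, a summary table can be rebuilt in batch and allowed to go stale between refreshes. The sketch below is an assumption about what that could look like, not something spelled out in this answer; the tcp_packets and tcp_stream_stats tables and the refresh_stream_stats helper are hypothetical names.

```python
import sqlite3

def refresh_stream_stats(conn):
    """Rebuild the summary table from the packet table in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM tcp_stream_stats")
        conn.execute("""
            INSERT INTO tcp_stream_stats (stream_id, packet_count, byte_count)
            SELECT stream_id, COUNT(*), SUM(byte_len)
            FROM tcp_packets
            GROUP BY stream_id
        """)

def stream_stats(conn, stream_id):
    """Read the precomputed (possibly stale) counters for one stream."""
    return conn.execute(
        "SELECT packet_count, byte_count FROM tcp_stream_stats WHERE stream_id = ?",
        (stream_id,),
    ).fetchone()

# Hypothetical schema: names are illustrative assumptions.
stats_conn = sqlite3.connect(":memory:")
stats_conn.executescript("""
    CREATE TABLE tcp_packets (
        id INTEGER PRIMARY KEY,
        stream_id INTEGER NOT NULL,
        byte_len INTEGER NOT NULL
    );
    CREATE TABLE tcp_stream_stats (
        stream_id INTEGER PRIMARY KEY,
        packet_count INTEGER NOT NULL,
        byte_count INTEGER NOT NULL
    );
""")
stats_conn.executemany(
    "INSERT INTO tcp_packets (stream_id, byte_len) VALUES (?, ?)",
    [(1, 100), (1, 200), (2, 50)],
)
refresh_stream_stats(stats_conn)

# A packet arriving after the refresh is invisible until the next refresh.
stats_conn.execute("INSERT INTO tcp_packets (stream_id, byte_len) VALUES (2, 70)")
stale_stats = stream_stats(stats_conn, 2)
refresh_stream_stats(stats_conn)
fresh_stats = stream_stats(stats_conn, 2)
print(stale_stats, fresh_stats)
```

If the data must be current instead, the refresh would have to happen on every write (for example with triggers) or the value would have to be computed at read time.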
That said, I would develop against the same quantity of data as I have in prod so I have some confidence in response time. You should rarely be surprised by your code performance...
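One way to act on this is to load a production-sized volume of synthetic rows and time the on-demand aggregate before deciding anything. The following is a minimal sketch with Python's sqlite3 module; ROW_COUNT and INVOICE_COUNT are placeholder assumptions to be replaced with real production figures, and timings from an in-memory SQLite database will not match those of another database engine.

```python
import random
import sqlite3
import time

ROW_COUNT = 100_000     # assumption: set this to the production row count
INVOICE_COUNT = 1_000   # assumption: number of parent invoices

rng = random.Random(0)  # fixed seed so runs are repeatable
bench_conn = sqlite3.connect(":memory:")
bench_conn.execute(
    "CREATE TABLE invoice_line_items ("
    "invoice_id INTEGER NOT NULL, price_cents INTEGER NOT NULL)"
)
bench_conn.executemany(
    "INSERT INTO invoice_line_items (invoice_id, price_cents) VALUES (?, ?)",
    ((rng.randrange(INVOICE_COUNT), rng.randrange(1, 10_000))
     for _ in range(ROW_COUNT)),
)
bench_conn.execute(
    "CREATE INDEX idx_line_items_invoice ON invoice_line_items (invoice_id)"
)

# Time the on-demand aggregation over the whole table.
start = time.perf_counter()
totals = bench_conn.execute(
    "SELECT invoice_id, SUM(price_cents) FROM invoice_line_items GROUP BY invoice_id"
).fetchall()
elapsed = time.perf_counter() - start

print(f"{len(totals)} invoice totals computed in {elapsed * 1000:.1f} ms")
```

If the measured time is acceptable at production volume, there is no case yet for a precomputed field.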