What is Minimum Occupancy in B-Trees?
I'm fairly new to the B-Tree concept and am currently reading the slides for a course, which can be found here: http://www-db.deis.unibo.it/courses/TBD/Lezioni/02%20-%20Indices.pdf
I read that B-trees have a "minimum occupancy" of 50%.
What does that mean? Is that a good percentage for minimum occupancy? And is it better to have a higher or lower minimum occupancy?
Thanks
This answer applies to ENGINE = InnoDB.
For all practical purposes, a given BTree is either "full" or 69% full. This does not address individual blocks.
Individual blocks...
When initially loading a BTree in key order, each block is filled to 15/16 full.
The "last" block can be nearly empty -- assuming the insert logic decides that the tree is being appended to.
When filling randomly, there will be block splits that leave two consecutive blocks at about 50% full each.
In the long run (continual churn and/or additions), a BTree settles down to an average of about 69% full. (This is a general property of BTrees under random insertion; the expected fill is about ln 2 ≈ 0.69.)
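The 50% and 69% figures can be reproduced with a toy simulation. This is only a sketch under stated assumptions: it models leaf blocks alone, uses a hypothetical capacity of 100 keys per block, and always splits an overfull block in half. It is not InnoDB's algorithm (there is no 15/16 fill for ordered loads and no merging), and `simulate_fill` is a made-up name.

```python
import bisect
import random


def simulate_fill(num_keys, capacity=100, ordered=False, seed=0):
    """Insert keys into toy leaf blocks and return the average fill (0..1).

    Assumption: an overfull block is always split in half, which is the
    textbook rule and not InnoDB's append-aware behaviour.
    """
    rng = random.Random(seed)
    keys = [rng.random() for _ in range(num_keys)]
    if ordered:
        keys.sort()  # simulate loading in key order

    leaves = [[]]             # each leaf is a sorted list of keys
    lows = [float("-inf")]    # lows[i] is the lower bound of leaf i

    for key in keys:
        i = bisect.bisect_right(lows, key) - 1   # leaf that owns this key
        leaf = leaves[i]
        bisect.insort(leaf, key)
        if len(leaf) > capacity:                 # overflow: split in half
            mid = len(leaf) // 2
            right = leaf[mid:]
            del leaf[mid:]
            leaves.insert(i + 1, right)
            lows.insert(i + 1, right[0])

    return num_keys / (len(leaves) * capacity)


if __name__ == "__main__":
    print("random order:", round(simulate_fill(50_000), 3))
    print("key order:   ", round(simulate_fill(50_000, ordered=True), 3))
```

With random keys the average fill should land somewhere near 0.69. With keys in order and a naive half split it stays near 0.5, which is why an engine that recognizes appends and fills blocks almost completely gains so much on ordered loads.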
When in the middle of a transaction, extra copies of rows may be placed in blocks; after cleanup, those go away.
When two adjacent blocks are less than half full, the code may try to combine the blocks.
InnoDB preallocates blocks, so some blocks (at any moment) are completely empty.
Some database vendors provide all sorts of tunables for min/max/etc occupancy. MySQL follows the KISS principle; almost nothing is tunable (recent versions do expose innodb_fill_factor and a per-index MERGE_THRESHOLD). The effect is that the BTrees are reasonably efficient. Further, note that there are limited choices in indexing (for InnoDB):
The PRIMARY KEY is unique and clustered; no options here.
Secondary indexes (if any) are non-clustered and have the PRIMARY KEY column(s) in the leaf node. That is, to locate the entire row via a secondary key, there are two BTree drill-downs.
A Rule of Thumb (for InnoDB's 16KB blocks): about 100 items are in each node of a BTree. Corollary: a trillion-row table or index will have about 6 levels in the BTree. (Now, isn't this paragraph simpler than those formulas, etc, in your link?)
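The corollary is just repeated multiplication by the fan-out. Here is a minimal sketch, assuming the rule-of-thumb fan-out of 100 holds at every level (real fan-out depends on key and row size); `btree_levels` is a hypothetical helper, not a MySQL function.

```python
def btree_levels(rows, fanout=100):
    """Smallest number of levels such that fanout**levels >= rows.

    Assumption: every node holds exactly `fanout` items.
    Integer arithmetic avoids floating-point log rounding.
    """
    levels, reach = 0, 1
    while reach < rows:
        reach *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    for rows in (10**6, 10**9, 10**12):
        print(f"{rows:>16,} rows -> {btree_levels(rows)} levels")
```

A million rows need 3 levels, a billion need 5, and a trillion need 6, so depth grows very slowly with table size.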
InnoDB employs "B+Trees", so sequential scans can walk from one leaf node to the next.
See also Wikipedia for another discussion of BTrees.
Oh, back to the question about 50% -- that is "natural". Think about what a "block split" (aka "leaf split") does: it takes one full block and turns it into two adjacent half-full blocks. It does not make sense to ask for anything other than 50%. (Yeah, you could split a full block into 3, but that seems wasteful. Or you could split before it is completely full, but then nothing much is gained by that.)
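The arithmetic behind that argument can be shown with a small sketch. The block capacity of 100 keys and the `split_block` helper are assumptions made for illustration only; the code shows the occupancy left behind by a two-way versus a three-way split.

```python
CAPACITY = 100  # hypothetical number of keys that fit in one block


def split_block(keys, parts=2):
    """Split a sorted block into `parts` contiguous, nearly equal pieces."""
    base, extra = divmod(len(keys), parts)
    pieces, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        pieces.append(keys[start:end])
        start = end
    return pieces


def occupancy(pieces, capacity=CAPACITY):
    """Fraction of each resulting block that is in use."""
    return [len(p) / capacity for p in pieces]


if __name__ == "__main__":
    full_block = list(range(CAPACITY))
    print("2-way split:", occupancy(split_block(full_block)))
    print("3-way split:", occupancy(split_block(full_block, parts=3)))
```

A two-way split leaves both blocks at 50%, the highest occupancy any split of a full block can guarantee for every resulting block. A three-way split drops each block to about a third, which is the waste mentioned above.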