What is Minimum Occupancy in B-Trees?

I'm fairly new to the B-Tree concept. I'm currently reading the slides for a course, which can be found here: http://www-db.deis.unibo.it/courses/TBD/Lezioni/02%20-%20Indices.pdf

I read that B-trees have a "minimum occupancy" of 50%.

What does that mean? Is that a good percentage for minimum occupancy? And is it better to have a higher or a lower minimum occupancy?

Thanks

This answer applies to ENGINE = InnoDB.

For all practical purposes, a given BTree is either "full" or 69% full. That describes the tree as a whole; it does not address individual blocks.

Individual blocks:

  • When initially loading a BTree in key order, each block is filled to 15/16 full.

  • The "last" block can be nearly empty, assuming the insert logic thinks that the tree is being appended to.

  • When filling randomly, there will be block splits, each of which leaves two consecutive blocks at about 50% full.

  • In the long run (continual churn and/or additions), a BTree settles down to an average of about 69% full. (This is a general fact about BTrees; the figure is approximately ln 2.)

  • When in the middle of a transaction, extra copies of rows may be placed in blocks; after cleanup, those go away.

  • When two adjacent blocks are less than half full, the code may try to combine the blocks.

  • InnoDB preallocates blocks, so some blocks (at any moment) are completely empty.
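The 50% and 69% figures in the list above can be reproduced with a small simulation. This is only a rough sketch, not InnoDB's code: it models the leaf level alone, counts keys rather than bytes, splits an overflowing block in half, and ignores deletes and merges. The function name `simulate_leaf_fill` and its parameters are made up for illustration.

```python
import bisect
import random


def simulate_leaf_fill(num_keys, capacity, seed=0):
    """Insert random keys into a chain of leaf blocks.

    Toy model (an assumption, not InnoDB's algorithm): a block holds at
    most `capacity` keys; when an insert overflows it, the block is split
    into two roughly equal halves.
    Returns (average fill, minimum fill) over all leaf blocks.
    """
    rng = random.Random(seed)
    leaves = [[]]                  # each leaf is a sorted list of keys
    first_keys = [float("-inf")]   # lowest key routed to each leaf
    for _ in range(num_keys):
        key = rng.random()
        # Find the leaf whose key range contains `key`.
        i = bisect.bisect_right(first_keys, key) - 1
        leaf = leaves[i]
        bisect.insort(leaf, key)
        if len(leaf) > capacity:
            # Block split: the upper half moves to a new neighbouring leaf.
            mid = len(leaf) // 2
            right = leaf[mid:]
            del leaf[mid:]
            leaves.insert(i + 1, right)
            first_keys.insert(i + 1, right[0])
    fills = [len(leaf) / capacity for leaf in leaves]
    return sum(fills) / len(fills), min(fills)


avg_fill, min_fill = simulate_leaf_fill(num_keys=100_000, capacity=64)
print(f"average fill: {avg_fill:.2f}, minimum fill: {min_fill:.2f}")
```

With random inserts the average should land near 0.69, and no block should drop below half full, since every block is the product of a split.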

Some database vendors provide all sorts of tunables for min/max/etc occupancy. MySQL follows the KISS principle; almost nothing is tunable (recent versions do expose `innodb_fill_factor` for sorted index builds and a per-index `MERGE_THRESHOLD`). The effect is that the BTrees are reasonably efficient. Further, note that there are limited choices in indexing (for InnoDB):

  • The PRIMARY KEY is unique and clustered; no options here.
  • Secondary indexes (if any) are non-clustered and have the PRIMARY KEY column(s) in the leaf node. That is, to locate the entire row via a secondary key, there are two BTree drill-downs: one in the secondary index to get the PRIMARY KEY, then one in the clustered index to get the row.
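The two drill-downs can be pictured with a toy model. In this sketch plain Python dicts stand in for the two BTrees, and the table contents, the `email` column, and the function `find_by_email` are all hypothetical; it shows only the lookup order, not how InnoDB stores anything.

```python
# Clustered index: PRIMARY KEY -> entire row (hypothetical sample data).
clustered = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
    3: {"id": 3, "email": "cy@example.com", "name": "Cy"},
}

# Secondary index on email: its "leaf" holds the PRIMARY KEY, not the row.
secondary_email = {row["email"]: pk for pk, row in clustered.items()}


def find_by_email(email):
    """Locate a full row via the secondary key, in two lookups."""
    pk = secondary_email.get(email)   # drill-down 1: secondary index
    if pk is None:
        return None
    return clustered[pk]              # drill-down 2: clustered index


print(find_by_email("bob@example.com"))
```

A lookup by `id` would need only the second step, which is why PRIMARY KEY lookups are cheaper than secondary key lookups.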

A Rule of Thumb (for InnoDB's 16KB blocks): about 100 items are in each node of a BTree. Corollary: a trillion-row table or index will have about 6 levels in the BTree, since 100^6 = 10^12. (Now, isn't this paragraph simpler than those formulas, etc, in your link?)
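The corollary is just repeated multiplication by the fan-out. A minimal sketch, assuming a uniform fan-out of 100 at every level (the rule of thumb above, not a measured value); `btree_levels` is a made-up helper name:

```python
def btree_levels(rows, fanout=100):
    """Estimate BTree depth, assuming every node holds `fanout` items."""
    levels = 1
    reachable = fanout          # items addressable with one level
    while reachable < rows:
        reachable *= fanout     # each extra level multiplies the reach
        levels += 1
    return levels


for rows in (10**6, 10**9, 10**12):
    print(f"{rows:>16,} rows -> about {btree_levels(rows)} levels")
```

Integer arithmetic is used instead of `math.log` to avoid floating-point rounding at exact powers of the fan-out.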

InnoDB employs "B+Trees", so sequential scans can walk from one leaf node to the next.

See also Wikipedia for another discussion of BTrees.

Oh, back to the question about 50% -- that is "natural". Think about what a "block split" (aka "leaf split") does: it takes one full block and turns it into two adjacent half-full blocks. It does not make sense to ask for anything other than 50%. (Yeah, you could split a full block into 3, but that seems wasteful. Or you could split before it is completely full, but then nothing much is gained by that.)
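To see why 50% falls out of a split, here is a small sketch that cuts one full block into 2 or 3 pieces and reports how full each piece is. The block capacity of 100 keys and the helper `split_block` are assumptions for illustration only.

```python
def split_block(keys, parts=2):
    """Split a sorted block into `parts` consecutive, nearly equal blocks."""
    base, extra = divmod(len(keys), parts)
    blocks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)   # spread the remainder
        blocks.append(keys[start:start + size])
        start += size
    return blocks


CAPACITY = 100                      # assumed keys per block
full_block = list(range(CAPACITY))  # a completely full block

for parts in (2, 3):
    fills = [len(b) / CAPACITY for b in split_block(full_block, parts)]
    print(f"split into {parts}: fill per block = {fills}")
```

A two-way split leaves each block half full; a three-way split leaves each about a third full, which is the "wasteful" case mentioned above.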
