
Can the max and min height of a B+ tree be the same value?

I have this information:

  1. 10 million records.
  2. Each record is 256 bytes.
  3. Each record has a 32-byte index field.
  4. Our system's disk block size is 8 KB.
  5. Our system is 32-bit, so pointers are 4 bytes (use this for the links).

I want to find the maximum and minimum height of this B+ tree. When I worked it out, I got the same value for both. Is that possible? If not, can you point out the mistake?

The steps I took

DATA NODE Related Calculations

Let L be the number of records in a data node (leaf node).

  • Node size = Data count * Record size
  • 8 KB = L * 256 bytes
  • 8192 bytes = L * 256 bytes
  • L = 8192 / 256
  • L = 32

Remember: a data node can store at most 32 records, which means it must store at least 32 / 2 = 16 records.

So

  • Min data count is 16
  • Max data count is 32

INDEX NODE Related Calculations

Let M be the number of links (pointers) in an index node.

  • Node size = (M - 1) * index/key size + M * link (pointer) size
  • 8 KB = (M - 1) * 32 bytes + M * 4 bytes
  • 8192 bytes = (M - 1) * 32 + 4 * M
  • 8192 = 36M - 32
  • 36M = 8224
  • M = 228 (8224 / 36 rounded down)

Our data nodes can store at most 32 records and at least 16 records. Our index nodes can store at most 227 keys and at least 113 keys.
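The two capacity calculations above can be checked with a few lines of Python. This is only a sketch under the same assumptions as the question: nodes have no header overhead, and the minimum occupancy is taken as half the maximum. The constant and variable names are illustrative, not from the original post.

```python
BLOCK_SIZE = 8 * 1024   # 8 KB disk block, in bytes
RECORD_SIZE = 256       # bytes per record
KEY_SIZE = 32           # bytes per index field
POINTER_SIZE = 4        # bytes per link on a 32-bit system

# Leaf (data) node: holds whole records only.
max_records = BLOCK_SIZE // RECORD_SIZE
min_records = max_records // 2          # assumed half-full minimum

# Index node with M links and M - 1 keys must fit in one block:
#   (M - 1) * KEY_SIZE + M * POINTER_SIZE <= BLOCK_SIZE
# Solving for M and rounding down:
max_children = (BLOCK_SIZE + KEY_SIZE) // (KEY_SIZE + POINTER_SIZE)
min_children = max_children // 2        # assumed half-full minimum
max_keys = max_children - 1
min_keys = min_children - 1

print("records per leaf:", min_records, "to", max_records)
print("links per index node:", min_children, "to", max_children)
print("keys per index node:", min_keys, "to", max_keys)
```

Rounding M down matters here: 8224 / 36 is not a whole number, and a node with one more link would no longer fit in the block.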

Finding the minimum:

  • 10 million records = 32 records * leaf/data node count
  • Leaf node count = 10 million / 32
  • Leaf node count = 312500

So, to store 10 million records we need 312500 FULL data nodes!

For 312500 data nodes we need 312500 links from index nodes!

  • 312500 = link count * index nodes
  • 312500 = 228 * index nodes
  • index nodes = 1370

Since 1370 is bigger than 228, we need upper index nodes!

  • 1370 = 228 * upper index nodes
  • upper index nodes = 6

Since 6 is less than 228, we need a single root node on top of these 6 index nodes!

So the min height is 3.

Finding the maximum:

  • 10 million records = 16 records * leaf/data node count
  • Leaf node count = 10 million / 16
  • Leaf node count = 625000

So, to store 10 million records we need 625000 half-full data nodes!

For 625000 data nodes we need 625000 links from index nodes!

  • 625000 = link count * index nodes
  • 625000 = 228 * index nodes
  • index nodes = 2741

Since 2741 is bigger than 228, we need upper index nodes!

  • 2741 = 228 * upper index nodes
  • upper index nodes = 12

Since 12 is less than 228, we need a single root node on top of these 12 index nodes!

Question: So the max height is also 3?

Yes, that is perfectly possible. The difference shows up in the number of children the root gets: very few in the first case, and several times as many in the second.

I should note that you made a few errors. The main one is in the last part ("Finding the maximum"): you did not reduce the index nodes to half their capacity. You continued with 228, while you should have used 114.

Secondly, in the first case you need to round the divisions upwards, because an extra node is needed to hold the remainder of the division (with some redistribution of keys from a neighboring node, so that the extra node is not under-filled). In the second case it is right to round downwards, meaning that the remainder of the division is added to existing nodes (which have room for it).
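The rounding rule can be sanity-checked with a small sketch. It assumes the fan-out limits from the question (at most 228 and at least 114 children per index node) and spreads the children as evenly as possible, which is just one valid redistribution. The helper names `parents_compact`, `parents_sparse`, and `spread` are hypothetical.

```python
MAX_CHILDREN = 228   # links in a full index node
MIN_CHILDREN = 114   # links in a half-full index node

def parents_compact(children):
    """Fewest parents: fill nodes completely, round the division up."""
    return (children + MAX_CHILDREN - 1) // MAX_CHILDREN

def parents_sparse(children):
    """Most parents: fill nodes to the minimum, round the division down."""
    return children // MIN_CHILDREN

def spread(children, nodes):
    """Split `children` over `nodes` parents as evenly as possible."""
    base, extra = divmod(children, nodes)
    return [base + 1] * extra + [base] * (nodes - extra)

for label, leaves, count in (
    ("compact", 312500, parents_compact(312500)),
    ("sparse", 625000, parents_sparse(625000)),
):
    sizes = spread(leaves, count)
    # Every parent must respect both occupancy limits.
    assert all(MIN_CHILDREN <= s <= MAX_CHILDREN for s in sizes)
    print(label, count, "parents, children per parent:", min(sizes), "to", max(sizes))
```

Rounding the other way would break a limit in each case: one fewer parent in the compact case would overflow a node, and one more parent in the sparse case would leave a node below the minimum.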

These corrections do not change the final conclusion that 3 levels are needed in both cases. An overview:

| Fill style | Records / leaf | Children / index node | Data nodes | Level 3 | Level 2 | Level 1 |
|---|---|---|---|---|---|---|
| compact | 32 | 228 | 312500 | 1371 | 7 | 1 |
| sparse | 16 | 114 | 625000 | 5482 | 48 | 1 |
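The level counts can be reproduced with a short loop. This is a sketch under the assumptions used above: 32 or 16 records per leaf, 228 or 114 children per index node, rounding up for the compact fill and down for the sparse fill, and a root that may hold fewer children than the minimum. The function name `index_levels` is illustrative.

```python
RECORDS = 10_000_000

def index_levels(leaves, fanout, round_up):
    """Node counts per index level, from just above the leaves up to the root."""
    levels = []
    nodes = leaves
    while True:
        if round_up:
            parents = (nodes + fanout - 1) // fanout
        else:
            parents = nodes // fanout
        if parents < 2:
            # Too few nodes for another non-root level: a single root covers them.
            levels.append(1)
            return levels
        levels.append(parents)
        nodes = parents

# Compact fill: full leaves and full index nodes, divisions rounded up.
compact_leaves = (RECORDS + 32 - 1) // 32
compact = index_levels(compact_leaves, 228, round_up=True)

# Sparse fill: half-full leaves and half-full index nodes, divisions rounded down.
sparse_leaves = RECORDS // 16
sparse = index_levels(sparse_leaves, 114, round_up=False)

print("compact:", compact_leaves, "leaves, index levels", compact)
print("sparse: ", sparse_leaves, "leaves, index levels", sparse)
print("same number of index levels:", len(compact) == len(sparse))
```

Each list is ordered from Level 3 (just above the data nodes) to Level 1 (the root), so its length is the number of index levels for that fill style.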
