Comparing B+ tree implementations: storing internal nodes on disk

Original post: 2016-03-13 00:54:42 · Tags: data-structures, filesystems, b-tree, bcache

Is there any implementation in which the internal nodes of a B+ tree are also stored on disk? I am wondering whether anyone is aware of such an implementation, or sees a real advantage in doing it this way. Normally, one stores the leaf nodes on disk and builds up the B+ tree as needed.

But it is also possible to save the current state of the B+ tree's internal nodes by replacing each pointer with the number of the disk block it points to. I see other challenges, such as keeping the in-memory internal nodes in sync with the disk blocks, but the B+ tree could be implemented on NVRAM, battery-backed DRAM, or some other medium that keeps it in sync.
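
To make the "pointers replaced by block numbers" idea concrete, here is a minimal sketch of one possible on-disk layout for an internal node. The block size, the header layout, and the names pack_internal, unpack_internal and child_block are all assumptions for illustration, not taken from bcache or any other real implementation.

```python
import bisect
import struct

BLOCK_SIZE = 4096                 # assumed block size
HEADER = struct.Struct("<BH")     # assumed header: node type (1 = internal), key count


def pack_internal(keys, children):
    """Serialize an internal node: n integer keys and n + 1 child block numbers."""
    assert len(children) == len(keys) + 1
    body = struct.pack(f"<{len(keys)}q{len(children)}Q", *keys, *children)
    raw = HEADER.pack(1, len(keys)) + body
    assert len(raw) <= BLOCK_SIZE, "node does not fit in one block"
    return raw.ljust(BLOCK_SIZE, b"\x00")   # pad to a full block


def unpack_internal(block):
    """Inverse of pack_internal: returns (keys, child block numbers)."""
    node_type, n = HEADER.unpack_from(block, 0)
    assert node_type == 1
    values = struct.unpack_from(f"<{n}q{n + 1}Q", block, HEADER.size)
    return list(values[:n]), list(values[n:])


def child_block(block, key):
    """Return the block number of the child to descend into for key."""
    keys, children = unpack_internal(block)
    # Keys equal to a separator go to the right-hand child.
    return children[bisect.bisect_right(keys, key)]
```

With this layout, descending the tree means reading the block whose number child_block returns, rather than following an in-memory pointer.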

Has anyone already implemented it this way, as in Linux's bcache or some other implementation?

Cheers, cforfun!

1 answer

All persistent B+ tree implementations I've ever seen (as opposed to purely 'transient' in-memory structures) store both node types on disk.

Not doing so would require scanning all the data (the external nodes, aka the 'sequence set') on every load in order to rebuild the index, which is feasible only when you're dealing with piddling small amounts of data or very special circumstances.
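
For a sense of what that rebuild involves, here is a rough sketch of a bottom-up index rebuild from the leaf sequence. It assumes a prior full scan has already produced, for each leaf, its smallest key and block number in key order, and that scan is the expensive part. The names rebuild_index and find_leaf, the dict-based node format, and the fanout are hypothetical.

```python
import bisect


def rebuild_index(leaves, fanout=4):
    """Build internal levels bottom-up from a non-empty, key-ordered list of
    (smallest_key, leaf_block_no) pairs. Returns (root, height)."""
    level = list(leaves)     # (smallest key in subtree, child) pairs
    height = 0
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            node = {
                # Separator i is the smallest key found under child i + 1.
                "keys": [k for k, _ in group[1:]],
                "children": [c for _, c in group],
            }
            parents.append((group[0][0], node))
        level = parents
        height += 1
    return level[0][1], height


def find_leaf(root, height, key):
    """Descend height levels and return the leaf block number for key."""
    node = root
    for _ in range(height):
        node = node["children"][bisect.bisect_right(node["keys"], key)]
    return node
```

The sketch does not rebalance an undersized last node on each level, which a real bulk loader would normally do.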

I've seen single-user implementations that sync the disk image only when the page manager ejects a dirty page and on program shutdown. The effect is that often-used internal nodes, which are rarely replaced or ejected, can go without a sync to disk for a long time. This is somewhat justified by the fact that internal ('index') nodes can be rebuilt after a crash, so only the external ('data') nodes need the full fault-tolerant persistence treatment. The advantage of such schemes is that they eliminate the wasted writes for nodes close to the root, whose update frequency is fairly high. Think SSDs, for example.
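
A minimal sketch of such a write-back page manager, assuming LRU replacement and using a plain dict as a stand-in for the block device. The class WriteBackCache and its methods are hypothetical names, not taken from any particular implementation.

```python
from collections import OrderedDict


class WriteBackCache:
    """Pages reach the disk only when a dirty page is ejected or on close()."""

    def __init__(self, disk, capacity):
        self.disk = disk              # stand-in block device: block_no -> bytes
        self.capacity = capacity
        self.pages = OrderedDict()    # block_no -> [data, dirty], in LRU order
        self.disk_writes = 0

    def _write_out(self, block_no, data):
        self.disk[block_no] = data
        self.disk_writes += 1

    def _load(self, block_no):
        if block_no in self.pages:
            self.pages.move_to_end(block_no)      # mark as most recently used
        else:
            if len(self.pages) >= self.capacity:
                victim, (data, dirty) = self.pages.popitem(last=False)
                if dirty:                         # clean pages are just dropped
                    self._write_out(victim, data)
            self.pages[block_no] = [self.disk.get(block_no, b""), False]
        return self.pages[block_no]

    def read(self, block_no):
        return self._load(block_no)[0]

    def write(self, block_no, data):
        page = self._load(block_no)
        page[0], page[1] = data, True             # update in memory, mark dirty

    def close(self):
        """Shutdown: flush every page that is still dirty."""
        for block_no, (data, dirty) in self.pages.items():
            if dirty:
                self._write_out(block_no, data)
        self.pages.clear()
```

If block 0 plays the role of the root and is touched before every leaf update, it never becomes the least recently used page, so it is written exactly once, at close(), no matter how often it changed in memory.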

One way of increasing disk efficiency for persisted in-memory structures is to persist only the log to disk and to rebuild the whole tree from the log on each restart. One very successful Java package uses this approach to great advantage.
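
A small sketch of the log-only idea, using a plain dict as a stand-in for the in-memory tree and a JSON-lines file as the log. The class LogBackedMap and the record format are assumptions for illustration; this is not a description of how that Java package works.

```python
import json
import os


class LogBackedMap:
    """Only the operation log is persisted; the in-memory structure is
    rebuilt by replaying the log on every start."""

    def __init__(self, path):
        self.data = {}                       # stand-in for the in-memory tree
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:               # replay the log in order
                    self._apply(json.loads(line))
        self.log = open(path, "a", encoding="utf-8")

    def _apply(self, rec):
        if rec["op"] == "put":
            self.data[rec["key"]] = rec["value"]
        else:                                # "del"
            self.data.pop(rec["key"], None)

    def _append(self, rec):
        # Make the record durable first, then apply it in memory.
        self.log.write(json.dumps(rec) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self._apply(rec)

    def put(self, key, value):
        self._append({"op": "put", "key": key, "value": value})

    def delete(self, key):
        self._append({"op": "del", "key": key})

    def close(self):
        self.log.close()
```

Every update costs one sequential append instead of rewriting tree pages. The price is a replay on each restart, so a real system would also need log compaction or snapshots, and handling for a torn last record after a crash, none of which is shown here.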