
Splay tree real life applications

Original 2017-12-05 13:15:00 5 2 algorithm/ data-structures/ tree/ binary-search-tree

Where would you use a splay tree in production? I mean a REAL LIFE example.

I was thinking about implementing autocomplete using tries and splay trees. For a large dataset it's not a good idea to traverse the trie from node x all the way down to the leaves to return results, so the idea was to keep a splay tree inside each trie node. When the user enters 'sta', the lookup walks to the node for 'sta' (the 'a' node) and returns the top 5 elements of that node's splay tree (by BFS/level-order traversal, which doesn't necessarily mutate the tree).

Of course, after an autocomplete variant is picked, we should traverse back up the trie and update the splay trees inside all of those nodes.
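
The scheme described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain `Counter` stands in for the per-node splay tree, and the names `TrieNode`, `Autocomplete`, `suggest` and `pick` are hypothetical, not from the original post.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}
        # Stand-in for the per-node splay tree: pick counts of every
        # word that passes through this node.
        self.hits = Counter()


class Autocomplete:
    def __init__(self, words):
        self.root = TrieNode()
        for word in words:
            self._walk(word, 0)  # register the word with zero picks

    def _walk(self, word, weight):
        # Visit every node on the word's path and bump its count there.
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            node.hits[word] += weight

    def suggest(self, prefix, k=5):
        # Walk to the prefix node, then rank without descending further.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        ranked = sorted(node.hits.items(), key=lambda kv: (-kv[1], kv[0]))
        return [word for word, _ in ranked[:k]]

    def pick(self, word):
        # The user chose `word`: update every node along its path.
        self._walk(word, 1)
```

Note that `suggest` is read-only here, while `pick` writes to every node on the path, which is exactly where the concurrency concern would show up. Storing each word in every node on its path also costs memory.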

Since splay trees are sensitive in concurrent environments, I was questioning their usage in production.

Your ideas?

2 answers

Splay trees are not a good match for data which rarely or never changes, particularly in a threaded environment. The extra mutations during read operations defeat memory caches and can create unnecessary lock contention. In any case, for read-only data structures, you can do a one-time computation of an optimal tree. Even if that computation is slow, it will have no impact on the long-term execution time.

I'm not entirely persuaded by the claim that large tries are slow, and certainly not in the case of autocompleters. On even not-so-modern hardware, the cost of a trie traversal is trivial compared to the time it takes for the user to type a character, or even the time it takes for the underlying keyboard driver and input processor to deliver the keypress to your application.

If you really need to optimise a trie, there is good reason to believe that a hybrid data structure works well: a trie at the root, combined with a linear (or binary) search once the remaining alternatives can fit in a cache line. This maximizes the benefit of the trie's large fan-out while avoiding the poor caching behaviour and excessive storage overhead at the end of the lines.
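
A minimal sketch of such a hybrid, assuming a small word-count threshold (`BUCKET_LIMIT`, a hypothetical stand-in for "fits in a cache line") and hypothetical function names `build` and `complete`. Python cannot control memory layout, so this only shows the shape of the structure, not its cache behaviour.

```python
from bisect import bisect_left

BUCKET_LIMIT = 8  # assumption: stand-in for "fits in a cache line"


def build(words, depth=0):
    """Return a sorted list (small bucket) or a dict keyed by character."""
    words = sorted(set(words))
    if len(words) <= BUCKET_LIMIT:
        return words
    groups = {}
    for w in words:
        # Words that end exactly at this depth go under the "" key.
        key = w[depth] if depth < len(w) else ""
        groups.setdefault(key, []).append(w)
    return {ch: build(group, depth + 1) for ch, group in groups.items()}


def _collect(node):
    # Yield every word stored at or below this node.
    if isinstance(node, dict):
        for child in node.values():
            yield from _collect(child)
    else:
        yield from node


def complete(node, prefix):
    """Return all stored words starting with `prefix`, in sorted order."""
    depth = 0
    while isinstance(node, dict):
        if depth == len(prefix):
            # Prefix used up inside the trie part: gather the subtree.
            return sorted(_collect(node))
        node = node.get(prefix[depth])
        if node is None:
            return []
        depth += 1
    # Bucket: binary search to the first candidate, then a linear scan.
    i = bisect_left(node, prefix)
    out = []
    while i < len(node) and node[i].startswith(prefix):
        out.append(node[i])
        i += 1
    return out
```

The trie levels handle the high fan-out near the root, and each bucket replaces what would otherwise be a long, sparse chain of single-child nodes.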

Splay trees are most useful (if they are useful at all) on data structures which are modified frequently. The classic example is a "rope" data structure (a tree of string segments), which is one way to attempt to optimise a text editor by avoiding large string copies. Compared with a deterministic tree-balancing algorithm such as red-black trees, the splay tree algorithm has the benefit of simplicity, as well as only touching nodes which form part of the tree traversal.
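
For readers unfamiliar with the operation, here is a small sketch of a recursive splay, one common textbook formulation rather than any particular library's implementation. The names `Node`, `splay`, `insert` and `inorder` are chosen for illustration. It shows both points made above: only nodes on the search path are rotated, and even a pure lookup rewrites the tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y


def splay(root, key):
    """Bring `key` to the root (or, if absent, a node on its search path)."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:        # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:      # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:           # zig-zig (mirror)
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:         # zig-zag (mirror)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


def insert(root, key):
    if root is None:
        return Node(key)
    root = splay(root, key)
    if root.key == key:
        return root
    node = Node(key)
    if key < root.key:
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in range(1, 8):
    root = insert(root, k)
root = splay(root, 3)  # a "read" of key 3 restructures the tree
```

After the last line the accessed key sits at the root while the in-order sequence of keys is unchanged. In a threaded setting, every such lookup would need write access to the nodes it rotates.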

However, the ready availability of self-balancing tree libraries (part of the standard libraries of many modern programming languages) combined with often-disappointing empirical results make the splay algorithm a niche product at best, although it is certainly a fascinating idea.

I found a quite interesting usage of splay trees in network load optimisation; it's called SplayNet. An autonomous system (I think under Facebook) implemented this, maybe around 2015, and they somehow managed to lower their internal communication load by around 40% (?). So there is a good use for splay trees!

A few weeks ago I was also reading that whether splay trees are useful depends on the spread in the sequence of searches. If there is none, you could just as well use plain binary trees or some static trees. But the moment there is one, splay trees perform better (if you use unlimited time).

In my thesis I use splay trees as a pre-processed data collection for the actual searching. The splay tree only stores the results of the most common search requests. In the next step, the search starts from the node given by the splay tree ... I think this is useful for big datasets, especially if the data is stored on different computers/storages, so your program has a better guess where to start. To say it the easy way: my splay trees store the FAQ of the given data structure/dataset :)