
Implementing a tree from scratch

I'm trying to learn about trees by implementing one from scratch. In this case I'd like to do it in C#, Java, or C++ (without using built-in methods).

So each node will store a character, and each node will have a maximum of 26 child nodes.

What data structure would I use to contain the pointers to each of the nodes?

Basically, I'm trying to implement a radix tree from scratch.

Thanks,

What data structure would I use to contain the pointers to each of the nodes?

A Node. Each Node should have references to (up to) 26 other Nodes in the tree. Within the Node you can store them in an array, a LinkedList, an ArrayList, or just about any other collection you can think of.
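As a rough illustration of a Node that holds its own child references, here is a minimal Java sketch that uses an ArrayList and a linear scan. The class name CharNode and its methods are hypothetical, not from this thread, and the sketch assumes at most one child per letter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type: stores one character and references to its children.
class CharNode {
    final char value;                                      // character held by this node
    final List<CharNode> children = new ArrayList<>();     // at most 26 entries in practice

    CharNode(char value) {
        this.value = value;
    }

    // Returns the child holding the given letter, or null if there is none.
    CharNode child(char letter) {
        for (CharNode candidate : children) {
            if (candidate.value == letter) {
                return candidate;
            }
        }
        return null;
    }

    // Returns the existing child for the letter, creating it first if needed.
    CharNode addChild(char letter) {
        CharNode existing = child(letter);
        if (existing != null) {
            return existing;
        }
        CharNode created = new CharNode(letter);
        children.add(created);
        return created;
    }
}
```

With at most 26 children the linear scan is cheap, and the list only grows as letters are actually used.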

It doesn't really matter. You can use a linked list, an array (but this will have a fixed size), or a List type from the standard library of your language.

Using a List or array will mean doing some index bookkeeping to traverse the tree, so it might be easiest to just keep references to the children in the parent.

Here's one I found recently that's not a bad API for trees. Although I needed graphs, it was handy to see how it was set up to separate the data structure from the data it was holding, so you could have a tree equivalent of Iterator to navigate through the tree, and so on.

https://jsfcompounds.dev.java.net/treeutils/site/apidocs/com/truchsess/util/package-summary.html

If you are actually more interested in speed than space, and if each node represents exactly one letter (implied by your max of 26), then I'd just use a simple array of 26 slots, each referencing a "Node" (the Node is the object containing your array).

The nice thing about a fixed-size array is that your lookup would be much quicker. If you were looking up a char c that was already guaranteed to be a lowercase letter, the lookup would be as easy as:

nextNode=nodes[c-'a'];

A recursive lookup of a string would be trivial.
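One way such a recursive lookup might look in Java is sketched below. The class name ArrayNode and the method find are assumptions made for illustration, and the input is assumed to contain only lowercase letters a-z.

```java
// Hypothetical node with a fixed array of 26 child slots.
class ArrayNode {
    final ArrayNode[] nodes = new ArrayNode[26]; // slot i holds the child for letter 'a' + i

    // Recursively follows word[pos..] starting at this node.
    // Returns the node reached, or null if the path does not exist.
    ArrayNode find(String word, int pos) {
        if (pos == word.length()) {
            return this;                         // consumed the whole string
        }
        char c = word.charAt(pos);
        ArrayNode nextNode = nodes[c - 'a'];     // assumes c is a lowercase letter
        if (nextNode == null) {
            return null;                         // no child for this letter
        }
        return nextNode.find(word, pos + 1);
    }
}
```

Note that this only checks that a path exists; telling whole words from prefixes needs an extra flag on each node.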

What you describe isn't quite a radix tree. In a radix tree, you can have more than one character in a node, and there is no upper bound on the number of child nodes.

What you're describing sounds more limited by the alphabet: each node can be a-z, and can be followed by another letter, a-z, and so on. The distinction is critical to the data structure you choose to hold your next-node pointers.

In the tree you describe, the easiest structure to use might be a simple array of pointers. All you need to do is convert the character (e.g. 'A') to its ASCII value (65) and subtract the starting offset (65) to determine which 'next node' you want. This takes up more space, but insert and traversal are very fast.
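The offset arithmetic can be sketched in Java as follows. The helper name slotFor and the choice to return -1 for characters outside A-Z are assumptions added for illustration.

```java
// Hypothetical helper that turns an uppercase letter into an array slot.
class LetterIndex {
    // Maps 'A'..'Z' to 0..25; returns -1 for any other character.
    static int slotFor(char c) {
        int slot = c - 'A';   // 'A' is 65 in ASCII, so 'A' -> 0, 'B' -> 1, and so on
        return (slot >= 0 && slot < 26) ? slot : -1;
    }
}
```

Checking the range before indexing avoids an out-of-bounds access when the input contains digits, punctuation, or lowercase letters.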

In a true radix tree, you could have 3, 4, 78, or 0 child nodes, and your 'next node' list will have the overhead of sorting, inserting, and deleting. Much slower.
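To make the contrast concrete, here is a rough Java sketch of a radix-style node whose label can span several characters and whose children live in a sorted map. The names RadixNode, attach, and contains are hypothetical, and the sketch covers lookup only; it assumes every non-root label is non-empty.

```java
import java.util.TreeMap;

// Hypothetical radix-style node: a multi-character label plus sorted children.
class RadixNode {
    final String label;       // may hold more than one character
    final boolean isKey;      // true if a stored key ends at this node
    // Children sorted by the first character of their label; no fixed limit.
    final TreeMap<Character, RadixNode> children = new TreeMap<>();

    RadixNode(String label, boolean isKey) {
        this.label = label;
        this.isKey = isKey;
    }

    // Adds a child keyed by the first character of its label; returns this node.
    RadixNode attach(RadixNode child) {
        children.put(child.label.charAt(0), child);
        return this;
    }

    // Returns true if key, read from offset onward, is stored below this node.
    boolean contains(String key, int offset) {
        if (offset == key.length()) {
            return isKey;
        }
        RadixNode child = children.get(key.charAt(offset));
        if (child == null || !key.startsWith(child.label, offset)) {
            return false;
        }
        return child.contains(key, offset + child.label.length());
    }
}
```

For example, a root with an empty label and a child "te" that in turn has children "am" and "st" stores "team" and "test" in three nodes instead of six.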

I can't speak to Java, but if I were implementing a custom radix tree in C#, I'd use one of the built-in .NET collections. Writing your own sorted list isn't really helping you learn the tree concepts, and the built-in optimizations of the .NET collections are tough to beat. Then your code is simple: look up your next node; if it exists, grab it and go; if not, add it to the next-node collection.
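The same look-up-or-add loop can be sketched in Java with a standard-library map standing in for the .NET collection. The class name MapNode and the choice of HashMap are assumptions for illustration, not something this answer prescribes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical node whose next-node collection is a standard-library map.
class MapNode {
    final Map<Character, MapNode> next = new HashMap<>();

    // Walks the word one character at a time, creating missing nodes.
    // Returns the node reached at the end of the word.
    MapNode insert(String word) {
        MapNode current = this;
        for (char c : word.toCharArray()) {
            MapNode child = current.next.get(c);   // look up the next node
            if (child == null) {                   // not there: add it
                child = new MapNode();
                current.next.put(c, child);
            }
            current = child;                       // grab it and go
        }
        return current;
    }
}
```

Swapping HashMap for a sorted map such as TreeMap would keep children in order at the cost of slower lookups, which is the kind of tradeoff the choice of collection comes down to.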

Which collection you use depends on what exactly you're implementing through the tree. Every type of tree involves tradeoffs between insertion time, lookup time, etc. The choices you make depend on what is most important to the application, not the tree.

Make sense?

Thanks for the quick replies.

Yes, what snogfish said was correct. Basically, it's a tree with 26 nodes (A-Z) plus a bool isTerminator.

Each node has these values, and the nodes are linked to each other.
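A minimal Java sketch of that structure might look like the following; it uses plain object references, so no pointers or unsafe code are needed. The class name Trie and its methods are hypothetical, and the sketch assumes words made only of ASCII letters, folded to uppercase.

```java
import java.util.Locale;

// Hypothetical tree of letters: 26 child slots (A-Z) plus an isTerminator flag per node.
class Trie {
    private static final class Node {
        final Node[] children = new Node[26]; // one slot per letter A-Z, null if unused
        boolean isTerminator;                 // true if a word ends at this node
    }

    private final Node root = new Node();

    // Adds a word; throws if it contains anything other than ASCII letters.
    void insert(String word) {
        Node current = root;
        for (char c : word.toUpperCase(Locale.ROOT).toCharArray()) {
            int slot = c - 'A';
            if (slot < 0 || slot >= 26) {
                throw new IllegalArgumentException("not a letter: " + c);
            }
            if (current.children[slot] == null) {
                current.children[slot] = new Node();
            }
            current = current.children[slot];
        }
        current.isTerminator = true;
    }

    // Returns true only if the exact word was inserted earlier.
    boolean contains(String word) {
        Node current = root;
        for (char c : word.toUpperCase(Locale.ROOT).toCharArray()) {
            int slot = c - 'A';
            if (slot < 0 || slot >= 26) {
                return false;
            }
            current = current.children[slot];
            if (current == null) {
                return false;
            }
        }
        return current.isTerminator;
    }
}
```

The isTerminator flag is what separates a stored word such as "CAT" from its prefix "CA", which shares the same path but was never inserted.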

I have not learned pointers in depth yet, so my attempts today to implement this from scratch using unsafe code in C# were futile.

Therefore, I'd be grateful if someone could provide me with the code to get started in C# using the internal tree class. Once I get it started, I can port the algorithms to the other languages and just change them to use pointers.

Thanks very much, Michael

Check out this Simeon Pilgrim blog post, "Code Camp Puzzle Reviewed". One of the solutions uses a radix tree in C#, and you can download the solution.
