
How to traverse a Huffman tree recursively in search of a specific element? - C

I'm working on a Huffman coding program, and I'm currently at the stage of encoding/decoding the text file into a binary file. I have this piece of code that retrieves a node from the tree along with all its relevant data (character, frequency, route):

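EmptyString ( string );
/* c should be declared as int, not char, so that EOF can be told apart from a real byte */
while ( ( c = fgetc ( nameTextFile ) ) != EOF ) {
    /* find the node for this character and encode its route */
    nodeHuffmanTree = SearchHuffmanTree ( rootHuffmanTree, c );
    strcpy ( string, nodeHuffmanTree -> route );
    Encode ( nameBinaryFile, string );
    EmptyString ( string );
}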

Assume that the routes (0's and 1's) for each of these nodes have already been generated. What I want from the SearchHuffmanTree function is that, given a character, it searches for that character in the Huffman tree and returns the node that contains it. This matters because that node holds the route that the Encode function will convert into a byte.

I know that I can't treat the Huffman tree like a binary search tree because it doesn't share the same characteristics, so if I want to search for a specific character I'll have to traverse the whole tree.

I've already looked at alternatives that don't use recursion (some of them use a stack) and, although they are easier to understand, they produce considerably less simple and clean-looking code, so I'd prefer solutions using recursion.

I've already figured out the encoding/decoding part, so this is pretty much the final step towards finishing my code. Looking forward to any help you can give me.

Yes, you cannot assume anything about the position of any specific node (i.e. character) in your tree, since the position of the nodes depends on the frequency of the characters and not on their values. Thus, you will have to find a way to traverse the whole tree without making any assumptions.

There are two general ways of traversing a graph: breadth-first search (BFS), which is based on a queue, and depth-first search (DFS), which is based on a stack.

Since DFS is based on a stack, it lends itself naturally to recursion, with the call stack playing the role of the stack. Also, due to the differences in the way the two approaches traverse the tree, DFS will be more efficient on average in your case.

How does DFS work?

Well, the basic principle is that if a node is not a leaf, you perform a DFS on each of its children. If you choose the order in which the subtrees are traversed, you can take the highest-probability path first (that is, the subtree with the larger total frequency), which increases your chances of finding the result sooner.
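The ordering idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the `WeightedNode` type, its field names and the function name `SearchHeavierFirst` are made up here, and it assumes every internal node stores the sum of its children's frequencies, as is usual when a Huffman tree is built.

```c
#include <stddef.h>  /* NULL */

/* Hypothetical node layout, assumed for this sketch only. */
typedef struct WeightedNode {
    int character;                 /* meaningful only in leaves */
    unsigned long frequency;       /* internal nodes: sum of the children */
    struct WeightedNode *left;
    struct WeightedNode *right;
} WeightedNode;

/* DFS that descends into the heavier subtree first.
   Returns the leaf holding c, or NULL if c is not in the tree. */
WeightedNode *SearchHeavierFirst(WeightedNode *node, int c)
{
    if (node == NULL)
        return NULL;

    /* Leaf: the only kind of node that stores a real character. */
    if (node->left == NULL && node->right == NULL)
        return node->character == c ? node : NULL;

    /* Pick the child with the larger total frequency as the first try. */
    WeightedNode *first = node->left;
    WeightedNode *second = node->right;
    if (first == NULL ||
        (second != NULL && second->frequency > first->frequency)) {
        first = node->right;
        second = node->left;
    }

    WeightedNode *found = SearchHeavierFirst(first, c);
    if (found != NULL)
        return found;
    return SearchHeavierFirst(second, c);
}
```

The result is the same as with a fixed left-then-right order; only the number of nodes visited before a frequent character is found tends to shrink.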

Below is simple pseudocode for the algorithm:

DFS(node T, char x) {
    if (T is leaf)
        if (T.character == x)
            return T
        else
            return not found
    else
        foreach child of T
            result = DFS(child, x)
            if result != not found
                return result
        return not found
}
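In C, the pseudocode above might translate to something like the sketch below. The `HuffmanNode` layout (field names, the size of `route`) is an assumption, since the question only says that a node carries a character, a frequency and a route; the function name `SearchHuffmanTree` is taken from the question. The character is stored as an `int` so that it compares cleanly with the value returned by `fgetc`.

```c
#include <stddef.h>  /* NULL */

/* Hypothetical node layout, assumed for this sketch only. */
typedef struct HuffmanNode {
    int character;               /* meaningful only in leaves */
    unsigned frequency;
    char route[32];              /* "0"/"1" path from the root, NUL-terminated */
    struct HuffmanNode *left;
    struct HuffmanNode *right;
} HuffmanNode;

/* Recursive depth-first search.
   Returns the leaf holding c, or NULL if c is not in the tree. */
HuffmanNode *SearchHuffmanTree(HuffmanNode *node, int c)
{
    if (node == NULL)
        return NULL;

    /* Leaf: the only kind of node that stores a real character. */
    if (node->left == NULL && node->right == NULL)
        return node->character == c ? node : NULL;

    /* Internal node: try the left subtree, then the right one. */
    HuffmanNode *found = SearchHuffmanTree(node->left, c);
    if (found != NULL)
        return found;
    return SearchHuffmanTree(node->right, c);
}
```

A caller would check the result for NULL before reading `route`, since a character that never appeared when the tree was built will not be found.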

You can find more details on the Wikipedia page for DFS.


 