Encoding a Huffman Binary Tree

I am trying to write a function that takes a Huffman tree and a character. It should then encode the character and return the resulting code.

The code so far:

string encode(NodePtr root, char letter)
{
    string encode_str; //a string which will store the encoded string
    NodePtr tempNode = root; //Initialize a new Huffman to be used in this function
    NodePtr tempLeft = root->left;
    NodePtr tempRight = root->right;

        //A while loop that goes on until we find the letter we want
        while((tempLeft->letter != letter) || (tempRight->letter != letter))
        {
         if((tempRight->is_leaf()) && (tempRight->letter == letter)) //check if is leaf and is letter
         {
                encode_str = encode_str + '1';
         }
         else if ((tempLeft->is_leaf()) && (tempLeft->letter == letter)) //check if is leaf and is letter
         {
             encode_str = encode_str + '0';
         }
         else if ((tempRight->is_leaf()) && (tempRight->letter != letter)) //check if is leaf and is NOT letter
         {
             tempNode = root->left;
             tempLeft = tempNode->left;
             tempRight = tempNode->right;
             encode_str = encode_str + '0';
         }
          else if ((tempLeft->is_leaf()) && (tempLeft->letter != letter)) //check if is leaf and is NOT letter
         {
             tempNode = root->right;
             tempLeft = tempNode->left;
             tempRight = tempNode->right;
             encode_str = encode_str + '1';
         }
         }    

    return encode_str;
}

This has not worked so far, and debugging hasn't helped me either. Can anyone help me out here, or at least tell me whether my thinking is right?

If neither tempLeft nor tempRight is a leaf, you've got an infinite loop:

while((tempLeft->letter != letter) || (tempRight->letter != letter))
    {
        if((tempRight->is_leaf()) &&
           (tempRight->letter == letter)) 
        {
            // no
        }
        else if ((tempLeft->is_leaf()) &&
                 (tempLeft->letter == letter)) 
        {
            // no
        }
        else if ((tempRight->is_leaf()) && 
                 (tempRight->letter != letter)) 
        {
            // no
        }
        else if ((tempLeft->is_leaf()) && 
                 (tempLeft->letter != letter))
        {
            // no
        }
     }

There must be something you intend to do in the case where the nodes are not leaves. Maybe recurse?
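As a rough illustration of the recursive idea, here is a minimal sketch. The `Node` struct below is an assumption (the question does not show its own node type), and the function signature differs from the question's `encode`: it reports success through a `bool` and writes the code into an output string.

```cpp
#include <string>

// Hypothetical node type; the question's real NodePtr/Node is not shown.
struct Node {
    char letter;        // only meaningful for leaves
    const Node* left;   // '0' branch
    const Node* right;  // '1' branch
    bool is_leaf() const { return left == nullptr && right == nullptr; }
};

// Depth-first search for `letter`. On success, `code` holds the path from
// `node` to the leaf and the function returns true. On failure, `code` is
// left exactly as it was on entry.
bool encode(const Node* node, char letter, std::string& code) {
    if (node == nullptr) return false;
    if (node->is_leaf()) return node->letter == letter;

    code.push_back('0');                    // try the left subtree first
    if (encode(node->left, letter, code)) return true;

    code.back() = '1';                      // left failed, try the right subtree
    if (encode(node->right, letter, code)) return true;

    code.pop_back();                        // not below this node: undo
    return false;
}
```

Because it descends into both subtrees, this version does not depend on one child of every node being a leaf.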

(Per comments) You might be working with a variant of Huffman trees in which you can guarantee that every node is either a leaf or has at least one leaf child. If you can guarantee that, then the above does not matter (though it would be good to throw an exception if the case ever occurs). However, real-world Huffman trees do not have this property.


When one child is a leaf but is not your target letter, you attempt to set a new tempNode, tempLeft and tempRight for the next pass through the loop.

    else if ((tempRight->is_leaf()) && 
             (tempRight->letter != letter)) 
     {
         tempNode = root->left;
         tempLeft = tempNode->left;
         tempRight = tempNode->right;
         encode_str = encode_str + '0';
     } 

However, since you never modify root, tempNode = root->left will always set tempNode to the same node.

You probably want tempNode = tempNode->left.


To avoid repetition of code, you can move

tempLeft = tempNode->left;
tempRight = tempNode->right;

... to be the first thing that happens in the while() loop.
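Putting these two changes together, the loop might look roughly like the sketch below. It assumes the restricted tree shape mentioned earlier (every internal node has at least one leaf child) and throws if that assumption is violated. The `Node` struct and the name `encode_iterative` are hypothetical, not taken from the question.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical node type; the question's real NodePtr/Node is not shown.
struct Node {
    char letter;
    const Node* left;
    const Node* right;
    bool is_leaf() const { return left == nullptr && right == nullptr; }
};

std::string encode_iterative(const Node* root, char letter) {
    std::string code;
    const Node* node = root;
    while (node != nullptr && !node->is_leaf()) {
        // Refresh the children once, at the top of every iteration.
        const Node* left = node->left;
        const Node* right = node->right;
        if (left == nullptr || right == nullptr)
            throw std::logic_error("internal node with a missing child");

        if (left->is_leaf() && left->letter == letter) return code + '0';
        if (right->is_leaf() && right->letter == letter) return code + '1';

        // Step down from the current node, not from root.
        if (left->is_leaf()) {
            code += '1';
            node = right;
        } else if (right->is_leaf()) {
            code += '0';
            node = left;
        } else {
            throw std::logic_error("neither child is a leaf");
        }
    }
    throw std::invalid_argument("letter not found in tree");
}
```

Returning as soon as the letter is found also removes the need for the compound loop condition in the original.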


You say that debugging hasn't helped. Have you actually run it in a debugger?

Write a unit test that sets up your tree; validates that the tree actually contains what you intend it to; and calls this function with one letter. Decide how you think execution should proceed. Now run the code in a debugger, stepping through it. When it stops doing what you think it should, you'll be able to reason about why.
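For the "validate the tree" step, a couple of small helpers can be enough. The sketch below is only one possible way to do it; the `Node` struct and both function names are made up for illustration.

```cpp
#include <string>

// Hypothetical node type; the question's real NodePtr/Node is not shown.
struct Node {
    char letter;
    const Node* left;
    const Node* right;
    bool is_leaf() const { return left == nullptr && right == nullptr; }
};

// True if every node is either a leaf or has exactly two children.
bool is_well_formed(const Node* node) {
    if (node == nullptr) return false;
    if (node->is_leaf()) return true;
    return node->left != nullptr && node->right != nullptr &&
           is_well_formed(node->left) && is_well_formed(node->right);
}

// Concatenates the leaf letters in left-to-right order, so a test can
// compare the result against the letters it meant to put in the tree.
std::string leaf_letters(const Node* node) {
    if (node == nullptr) return "";
    if (node->is_leaf()) return std::string(1, node->letter);
    return leaf_letters(node->left) + leaf_letters(node->right);
}
```

A test would build a small tree by hand, check both helpers, and only then call the encoding function with a single letter.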


A common way to implement Huffman encoding is to have an array of leaf nodes, so you can reach the node through simple array access:

    NodePtr nodeA = nodes[0];

... and have a pointer to the parent in each node, and a field indicating whether it's the left or right child, so that you can easily traverse the tree backwards, from leaf to root, building up a code (in reverse):

    string code = "";
    NodePtr node = nodeA;
    while(node->parent != NULL) {
        code = node->code + code; 
        node = node->parent;
    }
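A slightly fuller sketch of the same idea follows. The field names `parent` and `code` come from the snippet above; everything else (the `Node` layout, the 256-entry table indexed by character value, and the function names) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <stdexcept>
#include <string>

// Hypothetical node layout for the parent-pointer approach.
struct Node {
    char letter;
    const Node* parent;  // nullptr for the root
    char code;           // '0' if this node is a left child, '1' if right
};

// Walks from the leaf up to the root, collecting one bit per edge,
// then reverses the bits so they read root-to-leaf.
std::string encode_from_leaf(const Node* leaf) {
    std::string code;
    for (const Node* node = leaf; node->parent != nullptr; node = node->parent) {
        code.push_back(node->code);
    }
    std::reverse(code.begin(), code.end());
    return code;
}

// Looks up the leaf for `letter` in a table indexed by character value.
std::string encode(const std::array<const Node*, 256>& leaves, char letter) {
    const Node* leaf = leaves[static_cast<unsigned char>(letter)];
    if (leaf == nullptr) throw std::invalid_argument("letter not in tree");
    return encode_from_leaf(leaf);
}
```

Appending and reversing once avoids repeatedly prepending to the string, which is what `code = node->code + code` does on every step.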
