
I don't understand this Huffman algorithm implementation

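    template<class T>
    void huffman(MinHeap<TreeNode<T>*> heap, int n)
    {
      // heap is presumably a min-heap of n single-node trees; n-1 merges leave one tree.
      for (int i = 0; i < n - 1; i++)
      {
        // Remove the two lowest-weight trees from the forest.
        TreeNode<T> *first = heap.pop();
        TreeNode<T> *second = heap.pop();
        // Join them under a new parent whose weight is the sum of both
        // (first and second are pointers, so members are reached with ->).
        TreeNode<T> *bt = new BinaryTreeNode<T>(first, second, first->data + second->data);
        heap.push(bt);
      }
    }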

My textbook, Fundamentals of Data Structures in C++, gives a two-page definition of Huffman coding followed by the code above. The book wasn't detailed enough for me, so I searched online and learned how the process of Huffman coding works. The textbook claims that a Huffman tree has been built by the time the code above finishes. That seems wrong to me: a Huffman tree is not necessarily a complete tree, yet the code above seems to always produce a complete tree because of the heap.push(). Can someone explain to me why this piece of code is not wrong?

The heap's tree structure does not necessarily match the resulting Huffman tree. Rather, the heap contains a forest of partial Huffman trees, each initially consisting of a single symbol node. The loop then repeatedly takes the two nodes with the least weight, combines them into one node, and puts the combined node back. At the end of the process, the heap contains one finished tree.
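To make the "forest in a heap" idea concrete, here is a minimal sketch that uses std::priority_queue instead of the textbook's MinHeap. The names Node, HuffmanTree and buildHuffman are made up for this illustration and are not the book's classes. The heap only stores (weight, node index) pairs; the tree shape lives entirely in the left/right links, so the heap being a complete tree internally says nothing about the shape of the Huffman tree.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical node type for this sketch (not the textbook's TreeNode<T>).
struct Node {
    long weight;
    int left;   // index of the left child in the node pool, -1 for a leaf
    int right;  // index of the right child in the node pool, -1 for a leaf
};

struct HuffmanTree {
    std::vector<Node> nodes;  // pool that owns every node
    int root;                 // index of the root in the pool
};

HuffmanTree buildHuffman(const std::vector<long>& weights) {
    assert(!weights.empty());
    HuffmanTree tree;
    using Entry = std::pair<long, int>;  // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    // Start with a forest of single-node trees, one per symbol.
    for (long w : weights) {
        tree.nodes.push_back({w, -1, -1});
        heap.push({w, static_cast<int>(tree.nodes.size()) - 1});
    }

    // Repeatedly merge the two lightest trees until one tree remains.
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        tree.nodes.push_back({a.first + b.first, a.second, b.second});
        heap.push({a.first + b.first, static_cast<int>(tree.nodes.size()) - 1});
    }

    tree.root = heap.top().second;
    return tree;
}
```

Calling buildHuffman({5, 9, 12, 13, 16, 45}) leaves a single entry in the heap: the root, whose weight is the sum of all the input weights.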

Huffman encoding works by taking the two lowest-value items at each step. When you first call the function, the two lowest-value items are popped off (your MinHeap is ordered by value) and "combined" into a decision node. That node is scored by the sum of its children's scores and put back into the heap. Inserting it back into the heap puts it in the right place based on its score: if it is still lower than every other item it will be first, otherwise it will be somewhere else.
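The re-insertion step can be traced with a small sketch that tracks only the scores. The function name mergeOrder is invented for this example, and std::priority_queue stands in for MinHeap.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Returns the score of each combined node, in the order the nodes are created.
std::vector<long> mergeOrder(const std::vector<long>& weights) {
    assert(!weights.empty());
    // Min-heap over the scores only; the tree links are omitted in this sketch.
    std::priority_queue<long, std::vector<long>, std::greater<long>> heap(
        weights.begin(), weights.end());

    std::vector<long> created;
    while (heap.size() > 1) {
        long a = heap.top(); heap.pop();
        long b = heap.top(); heap.pop();
        created.push_back(a + b);
        heap.push(a + b);  // the heap places the new score where it belongs
    }
    return created;
}
```

For the scores 5, 9, 12, 13, 16, 45 the first combined node scores 14. It is not the lowest item after re-insertion, because 12 and 13 are still in the heap, so the next merge takes those two instead. The full sequence of created scores is 14, 25, 30, 55, 100.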

So this algorithm builds the tree from the bottom up, and by the time only one item is left in the heap you have a finished tree. I don't understand what the 'n' is for, though; the loop should be while (heap.size() > 1). (If n is the number of items initially in the heap, the two are equivalent, since each pass removes two items and adds one, so n - 1 passes leave exactly one.) Regardless, the resulting tree is not necessarily "complete" in the sense a heap is: different branches will have different lengths depending on the frequencies of the items in the initial heap.
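One way to see that the branches end up with different lengths is to compute the depth of every leaf. This is a sketch under the same assumptions as the answer: std::priority_queue replaces MinHeap, and leafDepths is a name invented here. It records only parent links, which is enough to measure depth.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Returns, for each input weight, the depth of its leaf in the Huffman tree
// (which equals the length of that symbol's code).
std::vector<int> leafDepths(const std::vector<long>& weights) {
    assert(!weights.empty());
    std::vector<int> parent(weights.size(), -1);  // parent[i] == -1 means "no parent yet"
    using Entry = std::pair<long, int>;           // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    for (std::size_t i = 0; i < weights.size(); ++i)
        heap.push({weights[i], static_cast<int>(i)});

    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        int id = static_cast<int>(parent.size());  // index of the new internal node
        parent.push_back(-1);
        parent[a.second] = id;
        parent[b.second] = id;
        heap.push({a.first + b.first, id});
    }

    // Walk from each leaf up to the root, counting the edges.
    std::vector<int> depth(weights.size(), 0);
    for (std::size_t i = 0; i < weights.size(); ++i)
        for (int p = parent[i]; p != -1; p = parent[p])
            ++depth[i];
    return depth;
}
```

With the weights 5, 9, 12, 13, 16, 45 the depths come out as 4, 4, 3, 3, 3, 1, so the tree is clearly not complete. With four equal weights every leaf lands at depth 2, which shows the shape is driven by the frequencies and not by the heap.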
