Optimizing the Huffman tree algorithm from O(n^2) to O(n) with priority queue C++

I've been given this code snippet for organizing a Huffman tree.

// Build a Huffman tree from a collection of frequencies
template <typename E> HuffTree<E>*
buildHuff(HuffTree<E>** TreeArray, int count) {
    heap<HuffTree<E>*,minTreeComp>* forest = 
        new heap<HuffTree<E>*, minTreeComp>(TreeArray, count, count); 
    HuffTree<char> *temp1, *temp2, *temp3 = NULL;
    while (forest->size() > 1) {
        temp1 = forest->removefirst();   // Pull first two trees  
        temp2 = forest->removefirst();   //   off the list
        temp3 = new HuffTree<E>(temp1, temp2);
        forest->insert(temp3);  // Put the new tree back on list
        delete temp1;        // Must delete the remnants
        delete temp2;        //   of the trees we created
    }
    return temp3;
}

It's a pretty typical implementation (ignoring the poor templatization and obvious memory leak).

I'm supposed to revise this algorithm so that it runs in O(n) instead of O(n^2) using a priority queue. I'm not exactly sure how to implement this, but I'm guessing something along the lines of this:

template <typename E> 
HuffTree<E>* buildHuff(HuffTree<E>** TreeArray, int count) {
    PriorityQueue<HuffTree<E>*, MIN_SORT> forest(count);
    for(int i = 0; i < count; i++) {
        forest.enqueue(TreeArray[i], TreeArray[i]->weight());
    }

    HuffTree<E> *tree = NULL;
    HuffTree<E> *left, *right = NULL;
    while(forest.size() > 0) {
        left = forest.dequeue();
        if (tree) {
            right = tree;
        }
        else {
            right = forest.dequeue();
        }
        tree = new HuffTree<E>(left, right);
        delete left;
        delete right;
    }
    return tree;
}

But it doesn't work.

For the sake of brevity, I didn't include the referenced classes, but their implementation is pretty straightforward. I would appreciate any advice to help steer me in the right direction.

Your implementation always selects the just-created tree as one of the children for the next tree. That's not correct. Consider the (ordered) frequencies:

1, 1, 1, 1, 3

The first two will be combined to produce a node with frequency 2, but the correct second node will not include that node: it combines the remaining two 1s, which are now the two smallest frequencies.
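To put numbers on that, here is a small sketch (the function name `chainCost` is mine, not from the question) that totals the internal-node weights when every merge reuses the tree that was just built, which is what the loop in the question does. It assumes the input is already sorted in non-decreasing order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only.
// Sum of internal-node weights when each merge combines the tree built so far
// with the next smallest leaf. Input must be sorted in non-decreasing order.
long long chainCost(const std::vector<long long>& sorted) {
    if (sorted.size() < 2) return 0;          // nothing to merge
    long long tree = sorted[0] + sorted[1];   // first merge: two smallest leaves
    long long cost = tree;
    for (std::size_t i = 2; i < sorted.size(); ++i) {
        tree += sorted[i];                    // always reuse the previous tree
        cost += tree;
    }
    return cost;
}
```

For 1, 1, 1, 1, 3 this gives 2 + 3 + 4 + 7 = 16, whereas the Huffman merges (1+1, 1+1, 2+2, 3+4) give 2 + 2 + 4 + 7 = 15, so the chained tree is not optimal.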


I don't see how you can use a priority queue to make the solution O(n), since a priority queue requires O(log n) to remove the minimum element, and you remove O(n) elements in total. (The queue can be built in O(n), but not the way you do it, one enqueue at a time.)
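For reference, a minimal sketch of the priority-queue version, using `std::priority_queue` rather than the `PriorityQueue` class from the question (which I haven't seen). It works on bare frequencies and returns the sum of internal-node weights instead of building `HuffTree` nodes; the name `huffmanCostHeap` is made up for this example. The range constructor heapifies in O(n), but each of the n - 1 merges still pays O(log n) for its pops and push, so the whole thing is O(n log n).

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper for illustration only.
// Returns the sum of internal-node weights of a Huffman tree.
long long huffmanCostHeap(const std::vector<long long>& freqs) {
    using MinHeap = std::priority_queue<long long,
                                        std::vector<long long>,
                                        std::greater<long long>>;
    // Building from a range heapifies in linear time,
    // unlike n separate push calls.
    MinHeap forest(freqs.begin(), freqs.end());

    long long cost = 0;
    while (forest.size() > 1) {
        long long a = forest.top(); forest.pop();   // smallest
        long long b = forest.top(); forest.pop();   // second smallest
        forest.push(a + b);                         // merged tree goes BACK in the queue
        cost += a + b;
    }
    return cost;
}
```

Note that the merged node is pushed back into the queue and competes with the remaining leaves, which is the step missing from the loop in the question.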

If you're going to use an O(n log n) algorithm anyway, it's easier to just sort the frequencies in the first place. No further sorting needs to be done because the new nodes are produced with monotonically non-decreasing frequencies, so there is no need for a priority queue to keep them sorted. What you need is to (incrementally) merge the sorted leaves and the non-leaves (which are sorted as they are produced).
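A rough sketch of that merge, assuming the frequencies are already sorted in non-decreasing order. As a simplification it tracks only weights and returns the sum of internal-node weights; the name `huffmanCostSorted` is invented for this example, and a real version would carry `HuffTree<E>*` pointers through the same two queues. After the sort, the loop itself is O(n).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only.
// `leaves` must already be sorted in non-decreasing order.
long long huffmanCostSorted(const std::vector<long long>& leaves) {
    std::vector<long long> merged;   // internal-node weights, appended in non-decreasing order
    merged.reserve(leaves.size());
    std::size_t li = 0, mi = 0;      // read positions in leaves and merged
    long long cost = 0;

    // Take the smaller of the two queue fronts in O(1).
    auto takeMin = [&]() {
        bool useLeaf = li < leaves.size() &&
                       (mi >= merged.size() || leaves[li] <= merged[mi]);
        return useLeaf ? leaves[li++] : merged[mi++];
    };

    // n leaves need n - 1 merges; at least two items remain before each one.
    for (std::size_t merges = 1; merges < leaves.size(); ++merges) {
        long long a = takeMin();
        long long b = takeMin();
        merged.push_back(a + b);     // never smaller than the previous merged weight
        cost += a + b;
    }
    return cost;
}
```

With the frequencies 1, 1, 1, 1, 3 from above, the merges are 1+1, 1+1, 2+2, 3+4, for a total of 15.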
