
How to create a min heap of fixed size using std::priority_queue?

I can define a min heap as:

priority_queue<int, vector<int>, greater<int>> pq;

I have a stream of integers, and the min heap should hold at most a fixed number k of elements. It seems that priority_queue cannot do this.

If you want to use std::priority_queue, it's trivial to limit the size of the queue to k elements. Note, however, that you need a max heap, not a min heap: you have to decide whether a newly arrived value belongs in the heap, and it does only if it is smaller than the maximum value currently in the heap. The maximum is therefore the element you need quick access to.

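#include <cstddef>
#include <queue>
#include <vector>

// Keeps the k smallest values seen so far in a max heap of at most k elements.
class Topk {
  public:
    explicit Topk(std::size_t k) : k_(k) {}
    void insert(int value) {
      if (q_.size() < k_) q_.push(value);
      // Heap is full: replace the current maximum if the new value is smaller.
      else if (k_ > 0 && value < q_.top()) { q_.pop(); q_.push(value); }
    }
    // Empties the heap and returns its contents in ascending order.
    std::vector<int> finalize() {
      std::vector<int> result(q_.size());
      while (!q_.empty()) {
        result[q_.size() - 1] = q_.top();
        q_.pop();
      }
      return result;
    }
  private:
    std::size_t k_;
    std::priority_queue<int> q_;  // std::less by default, i.e. a max heap
};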
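The same pattern, mirrored, covers the case the question literally asks about: if the goal is to keep the k largest values of the stream, a min heap is the right structure, because the element to evict is then the smallest one held. The following is only a sketch under that assumption; `largest_k` is a hypothetical helper name and it takes the stream as a vector for simplicity.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper (not from the original answer): returns the k largest
// values of `stream` in descending order, using a min heap of at most k elements.
std::vector<int> largest_k(const std::vector<int>& stream, std::size_t k) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;  // min heap
    for (int value : stream) {
        if (heap.size() < k) {
            heap.push(value);
        } else if (k > 0 && value > heap.top()) {
            // Heap is full and the new value beats the smallest one kept.
            heap.pop();
            heap.push(value);
        }
    }
    // The heap yields values smallest first, so fill the result back to front.
    std::vector<int> result(heap.size());
    for (std::size_t i = result.size(); i > 0; --i) {
        result[i - 1] = heap.top();
        heap.pop();
    }
    return result;
}
```

Each insertion costs O(log k), and the heap never grows beyond k elements.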

Using the heap algorithms directly is not much more complicated:

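#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

class Topk {
  public:
    explicit Topk(std::size_t k) : k_(k) {}
    void insert(int value) {
      if (c_.size() < k_) {
        c_.push_back(value);
        // Heapify once, when the container first reaches k elements.
        if (c_.size() == k_) std::make_heap(c_.begin(), c_.end());
      }
      else if (k_ > 0 && value < c_[0]) {
        /* See note below */
        std::pop_heap(c_.begin(), c_.end());   // moves the maximum to the back
        c_.back() = value;                     // overwrite it with the new value
        std::push_heap(c_.begin(), c_.end());  // restore the heap property
      }
    }
    std::vector<int> finalize() {
      if (c_.size() < k_)
        std::sort(c_.begin(), c_.end());       // fewer than k values: not a heap yet
      else
        std::sort_heap(c_.begin(), c_.end());
      std::vector<int> c;
      std::swap(c, c_);                        // leave c_ empty
      return c;
    }
  private:
    /* invariant: if c_.size() == k_, then c_ is a max heap. */
    std::size_t k_;
    std::vector<int> c_;
};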

Note: <algorithm> does not include a heap sift_down operation, which is unfortunate for this application; the pop_heap / overwrite / push_heap sequence could be replaced with overwriting the root and sifting it down. That's still O(log k), but it is probably slightly faster.

If all the data is available up front as an iterator range, and you don't need to process it incrementally as it arrives, you can use std::partial_sort.

std::vector<int> x{10, 5, 1, 2, 3};
// Requires k <= x.size(); afterwards the first k positions hold the k smallest values, sorted.
std::partial_sort(x.begin(), x.begin() + k, x.end());

This places the k smallest elements, in ascending order, in the first k positions, with roughly O(n log k) run-time complexity. The order of the remaining elements is unspecified.
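If the input should be left untouched, or is only readable through input iterators, std::partial_sort_copy is a closely related option. A minimal sketch, where `smallest_k` is a hypothetical helper name and k is clamped to the input size:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the k smallest values of `input` in ascending
// order without modifying `input`. If k exceeds the input size, all values
// are returned, sorted.
std::vector<int> smallest_k(const std::vector<int>& input, std::size_t k) {
    std::vector<int> out(std::min(k, input.size()));
    // Copies the smallest out.size() elements into `out`, sorted.
    std::partial_sort_copy(input.begin(), input.end(), out.begin(), out.end());
    return out;
}
```

The destination range determines how many elements are kept, so only k values are stored besides the input itself.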
