How to create a min heap of fixed size using std::priority_queue?

I can define a min heap as:

priority_queue<int, vector<int>, greater<int>> pq;

I have a stream of integers, and the min heap's size should be fixed at k. It seems that priority_queue cannot do this.

If you want to use std::priority_queue, it's trivial to limit the size of the queue to k elements. Note, however, that to keep the k smallest values you need a max heap, not a min heap: a newly arrived value should be inserted only if it is smaller than the maximum value currently in the heap, so that maximum is the element you need quick access to.

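#include <cstddef>
#include <queue>
#include <vector>

// Keeps the k smallest values seen so far.
class Topk {
  public:
    explicit Topk(std::size_t k) : k_(k) {}
    void insert(int value) {
      // q_ is a max heap, so top() is the largest value currently kept.
      // The !q_.empty() check guards against k == 0.
      if (q_.size() < k_) q_.push(value);
      else if (!q_.empty() && value < q_.top()) { q_.pop(); q_.push(value); }
    }
    // Empties the heap and returns its contents in ascending order.
    std::vector<int> finalize() {
      std::vector<int> result(q_.size());
      while (q_.size()) {
        result[q_.size() - 1] = q_.top();
        q_.pop();
      }
      return result;
    }
  private:
    std::size_t k_;
    std::priority_queue<int> q_;
};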
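
Since the question mentions a stream of integers, here is a minimal sketch of the same max-heap idea applied directly to a std::istream. The function name k_smallest_from_stream is made up for illustration, and the input is assumed to be whitespace-separated integers.

```cpp
#include <cstddef>
#include <istream>
#include <queue>
#include <sstream>
#include <vector>

// Hypothetical helper (name chosen for this sketch): reads ints from `in`
// and returns the k smallest of them in ascending order.
std::vector<int> k_smallest_from_stream(std::istream& in, std::size_t k) {
  std::priority_queue<int> heap;  // max heap: top() is the largest value kept
  int value;
  while (in >> value) {
    if (heap.size() < k) {
      heap.push(value);
    } else if (k > 0 && value < heap.top()) {
      // The new value beats the current maximum, so it replaces it.
      heap.pop();
      heap.push(value);
    }
  }
  // Pop from largest to smallest, filling the result from the back.
  std::vector<int> result(heap.size());
  while (!heap.empty()) {
    result[heap.size() - 1] = heap.top();
    heap.pop();
  }
  return result;
}
```

Any std::istream works here, for example a std::istringstream in tests or std::cin for real input. The heap never holds more than k values, so memory use stays bounded no matter how long the stream is.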

Just using the heap algorithms is really not more complicated:

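#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Keeps the k smallest values seen so far.
class Topk {
  public:
    explicit Topk(std::size_t k) : k_(k) {}
    void insert(int value) {
      if (c_.size() < k_) {
        c_.push_back(value);
        // Heapify once, at the moment the container becomes full.
        if (c_.size() == k_) std::make_heap(c_.begin(), c_.end());
      }
      // The !c_.empty() check guards against k == 0.
      else if (!c_.empty() && value < c_[0]) {
        /* See note below */
        std::pop_heap(c_.begin(), c_.end());
        c_.back() = value;
        std::push_heap(c_.begin(), c_.end());
      }
    }
    // Returns the kept values in ascending order and leaves c_ empty.
    std::vector<int> finalize() {
      if (c_.size() < k_)
        std::sort(c_.begin(), c_.end());
      else
        std::sort_heap(c_.begin(), c_.end());
      std::vector<int> c;
      std::swap(c, c_);
      return c;
    }
  private:
    /* invariant: if c_.size() == k_, then c_ is a maxheap. */
    std::size_t k_;
    std::vector<int> c_;
};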

Note: <algorithm> does not include a heap_sift_down operation, which is unfortunate for this application; the pop / swap / push operation could be replaced with swap / sift_down. That's still O(log k), but it is probably slightly faster.
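
To illustrate what such a sift-down could look like, here is a minimal sketch. replace_top and k_smallest are hypothetical names written for this example, not standard library functions; the sketch assumes the usual implicit binary-tree layout (the children of index i are at 2i+1 and 2i+2), which is the layout std::make_heap and std::is_heap work with.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of <algorithm>: overwrite the root of a
// max heap with `value` and sift it down until the heap property holds.
void replace_top(std::vector<int>& heap, int value) {
  const std::size_t n = heap.size();
  if (n == 0) return;
  std::size_t i = 0;
  for (;;) {
    std::size_t child = 2 * i + 1;      // left child of i
    if (child >= n) break;              // i is a leaf
    if (child + 1 < n && heap[child] < heap[child + 1]) ++child;  // pick larger child
    if (!(value < heap[child])) break;  // value is not smaller than either child
    heap[i] = heap[child];              // move the larger child up
    i = child;
  }
  heap[i] = value;
}

// Hypothetical driver: the k smallest values of `input`, in ascending order.
std::vector<int> k_smallest(const std::vector<int>& input, std::size_t k) {
  std::vector<int> heap;
  if (k == 0) return heap;
  for (int value : input) {
    if (heap.size() < k) {
      heap.push_back(value);
      if (heap.size() == k) std::make_heap(heap.begin(), heap.end());
    } else if (value < heap.front()) {
      replace_top(heap, value);  // one sift-down instead of pop_heap + push_heap
    }
  }
  // With fewer than k inputs the vector was never heapified.
  if (heap.size() < k) std::sort(heap.begin(), heap.end());
  else std::sort_heap(heap.begin(), heap.end());
  return heap;
}
```

Whether this is measurably faster than the pop_heap / push_heap pair depends on the standard library and the data, so it is worth benchmarking before adopting it.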

If you have random-access iterators over the whole data set and don't need to process the values asynchronously, you can use std::partial_sort.

const int k = 3;  // example value; must not exceed x.size()
std::vector<int> x{10, 5, 1, 2, 3};
std::partial_sort(x.begin(), x.begin() + k, x.end());

This places the k smallest elements, in sorted order, at the front of x, with roughly O(n log k) run-time complexity. The order of the remaining elements is unspecified.
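
As a small sketch of how this might be wrapped for reuse, the helper below clamps k to the container size first, since passing a middle iterator beyond end() is undefined behavior. The name smallest_k is an assumption made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the k smallest values of `x` in ascending
// order. `x` is taken by value, so the caller's vector is left untouched.
std::vector<int> smallest_k(std::vector<int> x, std::size_t k) {
  k = std::min(k, x.size());  // never ask for more elements than exist
  std::partial_sort(x.begin(),
                    x.begin() + static_cast<std::ptrdiff_t>(k),
                    x.end());
  x.resize(k);                // drop the unsorted tail
  return x;
}
```

For instance, smallest_k({10, 5, 1, 2, 3}, 3) yields 1, 2, 3. Unlike the heap-based classes above, this needs all n values in memory at once.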
