
High performance heap sorting

I have a vector with more than 5 million elements. Each time I want to pick the element with the smallest key from the vector and do some processing on it. However, processing this particular element affects all of the remaining elements in the vector, so their keys change. So the next time I want to pick the element with the smallest key, I must sort the vector again. The problem is that the number of times I pick the smallest element will be as high as 0.5 million, so the program runs very slowly. To make this clearer, the following code illustrates it:

void function(vector<MyObj*>& A)
{   // A.size() is near 5 million, maybe even more, e.g. 50 million.
    make_heap(A.begin(), A.end(), compare); // compare is a self-defined comparator.
    for (int i = 0; i < 500000; i++)
    {
        MyObj* smallest_elem = A.front();
        pop_heap(A.begin(), A.end(), compare); // use the same comparator everywhere
        A.pop_back();
        Process_MyObj(smallest_elem); // here all of the remaining elements
                                      // in A are affected, so their keys change

        make_heap(A.begin(), A.end(), compare); // since all keys in A changed,
                                                // the heap must be rebuilt
    }
}

Is there any way to make this code run as efficiently as possible? Any idea is welcome, not limited to algorithmic improvements; for example parallelism or anything else. Thank you so much!

If Process_MyObj is indeed affecting the keys of all the elements in A, I don't think there is much you can do. If it only modified some of the keys, you could write code to update individual elements in the heap.

As your code is now I don't see what you gain from building a heap. I would just do a linear scan to find the minimal element, swap it with the last one, and then pop the last element.
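For illustration, a minimal sketch of that linear-scan approach, reusing MyObj, compare and Process_MyObj from the question:

#include <algorithm>
#include <vector>
using namespace std;

void function(vector<MyObj*>& A)
{
    for (int i = 0; i < 500000 && !A.empty(); i++)
    {
        // One O(n) pass to find the element with the smallest key.
        auto it = min_element(A.begin(), A.end(), compare);
        MyObj* smallest_elem = *it;
        // Swap it with the last element and remove it in O(1).
        iter_swap(it, A.end() - 1);
        A.pop_back();
        Process_MyObj(smallest_elem); // keys of the remaining elements change
    }
}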

You could try sorting the vector and picking the elements in order, instead of using a heap.

It will not improve the big-O complexity, but it could possibly improve the constant factor; see the sketch below.
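A rough sketch of that variant, assuming the same compare from the question and re-sorting on every iteration because all keys change; sorting in reverse order keeps the smallest element at the back so it can be removed in O(1):

#include <algorithm>
#include <vector>
using namespace std;

void function(vector<MyObj*>& A)
{
    // reversed comparator: largest key first, smallest key at the back
    auto rev = [](MyObj* a, MyObj* b) { return compare(b, a); };
    for (int i = 0; i < 500000 && !A.empty(); i++)
    {
        sort(A.begin(), A.end(), rev);
        MyObj* smallest_elem = A.back();
        A.pop_back();
        Process_MyObj(smallest_elem); // keys change, so re-sort next time
    }
}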

How much of the time is in Process_MyObj, and how much in the heap operations -- 50/50 %, 80/20 %?
This is important, because you want to balance the two. Consider the following general setup:

Make a Todo list
Loop:
    work on items ...
    update the Todo list

Too much time updating the list means not enough time doing real work. So first measure the ratio Process / Heap time.
A cheap way to do this is to make a second run with Process_MyObj done twice and compare the total times, e.g.

 P + H = 1.0 sec
2P + H = 1.7 sec
=> P = .7, H = .3: P / H = 70 % / 30 %.
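If you prefer measuring directly, a rough sketch with std::chrono, again reusing MyObj, compare and Process_MyObj from the question:

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <vector>
using namespace std;

void timed_function(vector<MyObj*>& A)
{
    using Clock = chrono::steady_clock;
    Clock::duration heap_time{}, process_time{};

    for (int i = 0; i < 500000 && !A.empty(); i++)
    {
        auto t0 = Clock::now();
        make_heap(A.begin(), A.end(), compare);
        MyObj* smallest_elem = A.front();
        pop_heap(A.begin(), A.end(), compare);
        A.pop_back();
        heap_time += Clock::now() - t0;     // H: time in heap operations

        auto t1 = Clock::now();
        Process_MyObj(smallest_elem);
        process_time += Clock::now() - t1;  // P: time in processing
    }
    printf("heap: %.2f s  process: %.2f s\n",
           chrono::duration<double>(heap_time).count(),
           chrono::duration<double>(process_time).count());
}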


make_heap runs in linear time -- see how-can-stdmake-heap-be-implemented-while-making-at-most-3n-comparisons -- so a speedup there will be tough. If the values are constant, a heap of 64-bit <32-bit value, 32-bit index> entries would be more cache-efficient than a heap of pointers.
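A rough sketch of that packing idea, assuming keys fit in 32 bits and there are at most 2^32 elements (pack / key_of / index_of are just illustrative helpers):

#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Pack <32-bit key, 32-bit index> into one 64-bit word. With the key in the
// high half, plain integer comparison orders entries by key first.
inline uint64_t pack(uint32_t key, uint32_t index)
{
    return (static_cast<uint64_t>(key) << 32) | index;
}
inline uint32_t key_of(uint64_t e)   { return static_cast<uint32_t>(e >> 32); }
inline uint32_t index_of(uint64_t e) { return static_cast<uint32_t>(e); }

void example(const std::vector<uint32_t>& keys)
{
    std::vector<uint64_t> heap;
    heap.reserve(keys.size());
    for (uint32_t i = 0; i < keys.size(); ++i)
        heap.push_back(pack(keys[i], i));

    // std::greater turns make_heap / pop_heap into a min-heap.
    std::make_heap(heap.begin(), heap.end(), std::greater<uint64_t>());
    std::pop_heap(heap.begin(), heap.end(), std::greater<uint64_t>());
    uint64_t smallest = heap.back();         // entry with the smallest key
    heap.pop_back();
    uint32_t obj_index = index_of(smallest); // index back into the object array
    (void)obj_index;
}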

whats-new-in-purely-functional-data-structures-since-okasaki on cstheory.stack lists dozens of papers, mostly theoretical, but one or two might be relevant to your problem.

Real speedups are almost always problem-specific, not general. Can you tell us more about the real problem?


Added: If most pops are small, and pushes big, try putting a small cacheheap in front of the big sorted list. Pseudocode:

 push: push( cacheheap )
 pop:  return min( cacheheap, bigsortedlist )

This could be effective if the cacheheap stays in the real CPU cache; YMMV.
(You might be able to cheat, and leave bigsortedlist inaccurate instead of sorting it every time.)
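A very rough sketch of that two-level idea (cacheheap / bigsortedlist / TwoLevelPQ are hypothetical names; the big list is kept sorted in descending key order so its minimum sits at the back):

#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical two-level priority queue: a small min-heap in front of a big
// list sorted descending (so its smallest element is at the back).
struct TwoLevelPQ {
    std::vector<int> cacheheap;       // small, ideally stays in the CPU cache
    std::vector<int> bigsortedlist;   // sorted descending, minimum at back

    void push(int x) {
        cacheheap.push_back(x);
        std::push_heap(cacheheap.begin(), cacheheap.end(), std::greater<int>());
    }

    int pop_min() {
        bool from_cache = !cacheheap.empty() &&
            (bigsortedlist.empty() || cacheheap.front() <= bigsortedlist.back());
        if (from_cache) {
            std::pop_heap(cacheheap.begin(), cacheheap.end(), std::greater<int>());
            int v = cacheheap.back();
            cacheheap.pop_back();
            return v;
        }
        int v = bigsortedlist.back();  // minimum of the big sorted list
        bigsortedlist.pop_back();
        return v;
    }
};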
