Sorting Optimization

I'm currently taking an algorithms class and decided it would be good practice to implement a few sorting algorithms and compare them. I implemented merge sort and quick sort and compared their run times against std::sort. My computer isn't the fastest, but for 1000000 elements, averaged over 200 runs, I get:

  1. std::sort -> 0.620342 seconds
  2. quickSort -> 2.2692 seconds
  3. mergeSort -> 2.19048 seconds

I would appreciate any comments on how to improve and optimize my implementation.

    #include <algorithm>   // std::min, std::max
    #include <cstdlib>     // rand
    #include <functional>  // std::function
    #include <vector>

    // Sorts nums[s, e) in place. defaultComparator (shown further down)
    // must be declared before this function.
    void quickSort(std::vector<int>& nums, int s, int e, std::function<bool(int,int)> comparator = defaultComparator){
        if(s >= e)
            return;

        int pivot;
        int a = s + (rand() % (e-s));
        int b = s + (rand() % (e-s));
        int c = s + (rand() % (e-s));

        //find median of the 3 random pivots
        int min = std::min(std::min(nums[a],nums[b]),nums[c]);
        int max = std::max(std::max(nums[a],nums[b]),nums[c]);
        if(nums[a] < max && nums[a] > min)
            pivot = a;
        else if(nums[b] < max && nums[b] > min)
            pivot = b;
        else
            pivot = c;

        //move the pivot to the front
        int temp = nums[s];
        nums[s] = nums[pivot];
        nums[pivot] = temp;

        //partition
        int i = s + 1, j = s + 1;
        for(; j < e; j++){
            if(comparator(nums[j] , nums[s])){
                temp = nums[i];
                nums[i++] = nums[j];
                nums[j] = temp;
            }
        }
        //put the pivot into its final position
        temp = nums[i-1];
        nums[i-1] = nums[s];
        nums[s] = temp;

        //sort left and right of partition
        quickSort(nums,s,i-1,comparator);
        quickSort(nums,i,e,comparator);
    }

Here s is the index of the first element and e is the index one past the last element. defaultComparator is just the following lambda function:

    auto defaultComparator = [](int a, int b){ return a <= b; };

    // Returns a sorted copy of nums[s, e).
    std::vector<int> mergeSort(std::vector<int>& nums, int s, int e, std::function<bool(int,int)> comparator = defaultComparator){
        std::vector<int> sorted(e-s);
        if(s == e)
            return sorted;
        int mid = (s+e)/2;
        if(s == mid){
            sorted[0] = nums[s];
            return sorted;
        }
        // pass the comparator down, otherwise the recursion falls back to the default
        std::vector<int> left = mergeSort(nums, s, mid, comparator);
        std::vector<int> right = mergeSort(nums, mid, e, comparator);

        unsigned int i = 0, j = 0;
        unsigned int c = 0;
        while(i < left.size() || j < right.size()){
            if(i == left.size()){
                sorted[c++] = right[j++];
            }
            else if(j == right.size()){
                sorted[c++] = left[i++];
            }
            else{
                if(comparator(left[i],right[j]))
                    sorted[c++] = left[i++];
                else
                    sorted[c++] = right[j++];
            }
        }
        return sorted;
    }

Thank you all

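The first thing I see is that you're passing a std::function<>, which adds an indirect call (comparable in cost to a virtual call) on every comparison and usually prevents inlining. It is also taken by value, so it gets copied at every recursive call. Try a template parameter instead (the comparator can still be a function or a lambda); the result will be direct, inlinable calls.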
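A minimal sketch of the template suggestion, assuming the half-open [s, e) convention from the question. The name quickSortT, the std::less<int> default and the middle-element pivot are assumptions made for illustration, not the asker's code:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// The comparator is a template parameter, so the compiler sees its concrete
// type (a lambda, a functor, a function pointer) and can inline the call.
// quickSortT is a hypothetical name.
template <typename Compare = std::less<int>>
void quickSortT(std::vector<int>& nums, int s, int e, Compare comp = Compare{}) {
    assert(0 <= s && e <= static_cast<int>(nums.size()));
    if (e - s < 2) return;                       // 0 or 1 element: nothing to do

    std::swap(nums[s], nums[s + (e - s) / 2]);   // middle element becomes the pivot
    int i = s + 1;
    for (int j = s + 1; j < e; ++j) {
        if (comp(nums[j], nums[s])) {            // direct call, no type erasure
            std::swap(nums[i], nums[j]);
            ++i;
        }
    }
    std::swap(nums[s], nums[i - 1]);             // pivot to its final position

    quickSortT(nums, s, i - 1, comp);
    quickSortT(nums, i, e, comp);
}
```

Calling quickSortT(v, 0, v.size()) sorts ascending; passing a lambda such as [](int a, int b){ return a > b; } as the fourth argument sorts descending without going through std::function.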

Second, when optimizing, never build the result in a local container (the std::vector<int> sorted in mergeSort) when an in-place variant exists. Sort in place. The client should be aware that you are sorting their vector; if they wish, they can make a copy in advance. You take a non-const reference for a reason. [1]

Third, there's a cost associated with rand(), and it's far from negligible. Unless you're sure you need the randomized variant of quicksort (and its protection against pathological input sequences), just use the first element as the pivot. Or the middle one, which also behaves well on already sorted input.

Use std::swap() to swap two elements. It states the intent explicitly, and compilers generally turn it into a handful of loads and stores that are hard to beat by hand. Whether the optimizer recognizes your intent in the manual three-line swaps can be verified from the assembly output.

The way you find the median of three elements is full of conditional moves / branches. The median value is simply nums[a] + nums[b] + nums[c] - max - min (watch out for int overflow with large values), and fetching nums[...], min and max at the same time could be optimized further.
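One way to get the median value with min/max only can be sketched as follows. It is a swapped-in formula rather than the sum-minus-extremes expression above, chosen because it cannot overflow. The names medianOf3 and medianPivot are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of three values using only std::min / std::max.
// No additions are involved, so there is no overflow risk.
inline int medianOf3(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Convenience wrapper: median of nums[a], nums[b], nums[c] as a pivot VALUE.
// It does not return an index, so it suits partition schemes that only
// need the pivot value.
inline int medianPivot(const std::vector<int>& nums,
                       std::size_t a, std::size_t b, std::size_t c) {
    assert(a < nums.size() && b < nums.size() && c < nums.size());
    return medianOf3(nums[a], nums[b], nums[c]);
}
```

Whether this actually beats the original branches depends on the compiler and target, so it is worth checking in a profiler.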

Avoid i++ when aiming at speed. Most optimizers will generate good code anyway (for a plain int there is normally no difference), but there's a small chance that the result is suboptimal. Be explicit when optimizing (++i after the swap), but only when optimizing.

But the most important one: valgrind/callgrind/kcachegrind. Profile, profile, profile. Only optimize what's really slow.

[1] There's an exception to this rule: const containers that you build from non-const ones. These are usually in-house types shared across multiple threads, so it's better to keep them const and copy when modification is needed. In that case you allocate a new container (const or not) in your function, but you'll probably keep the const one in the API for users' convenience.

For quick sort, use a Hoare-like partition scheme.

http://en.wikipedia.org/wiki/Quicksort#Hoare_partition_scheme
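A sketch of a Hoare-style quicksort along the lines of the linked article. Note that it uses inclusive bounds [lo, hi], unlike the half-open [s, e) in the question, and a middle-element pivot. The function names are hypothetical:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hoare-style partition on the inclusive range [lo, hi], lo < hi.
// Returns j such that every element in [lo, j] is <= every element
// in [j + 1, hi].
int hoarePartition(std::vector<int>& nums, int lo, int hi) {
    assert(0 <= lo && lo < hi && hi < static_cast<int>(nums.size()));
    const int pivot = nums[lo + (hi - lo) / 2];
    int i = lo - 1;
    int j = hi + 1;
    while (true) {
        do { ++i; } while (nums[i] < pivot);   // scan from the left
        do { --j; } while (nums[j] > pivot);   // scan from the right
        if (i >= j) return j;                  // scans met: partition done
        std::swap(nums[i], nums[j]);
    }
}

void quickSortHoare(std::vector<int>& nums, int lo, int hi) {
    if (lo >= hi) return;
    const int p = hoarePartition(nums, lo, hi);
    quickSortHoare(nums, lo, p);       // the pivot position is not excluded,
    quickSortHoare(nums, p + 1, hi);   // so the left part is [lo, p]
}

// Entry point for a whole vector.
void quickSortHoare(std::vector<int>& nums) {
    if (!nums.empty())
        quickSortHoare(nums, 0, static_cast<int>(nums.size()) - 1);
}
```

Compared with the partition loop in the question, this scheme typically performs fewer swaps and copes better with many equal elements.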

Median of 3 needs only three if / swap statements (effectively a bubble sort). There is no need for a min or max check.

    if(nums[a] > nums[b])
        std::swap(nums[a], nums[b]);
    if(nums[b] > nums[c])
        std::swap(nums[b], nums[c]);
    if(nums[a] > nums[b])
        std::swap(nums[a], nums[b]);
    // use nums[b] as pivot value

For merge sort, use an entry function that creates a working vector once, then pass that vector by reference to the actual merge sort function. For top down merge sort, the indices determine the start, middle, and end of each sub-vector.

If using top down merge sort, the code can avoid copying data by alternating the direction of merge depending on the level of recursion. This can be done using two mutually recursive functions, the first one where the result ends up in the original vector, the second one where the result ends up in the working vector. The first one calls the second one twice, then merges from the working vector back to the original vector, and vice versa for the second one. For the second one, if the size == 1, then it needs to copy 1 element from the original vector to the working vector. An alternative to two functions is to pass a boolean for which direction to merge.
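The two mutually recursive functions described above could look roughly like this. It is a sketch for std::vector<int> with plain <= comparison; all function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merge sorted runs src[lo, mid) and src[mid, hi) into dst[lo, hi).
void mergeRuns(const std::vector<int>& src, std::vector<int>& dst,
               std::size_t lo, std::size_t mid, std::size_t hi) {
    assert(lo <= mid && mid <= hi && hi <= src.size() && hi <= dst.size());
    std::size_t i = lo, j = mid;
    for (std::size_t k = lo; k < hi; ++k) {
        // On ties take from the left run, which keeps the sort stable.
        if (i < mid && (j >= hi || src[i] <= src[j])) dst[k] = src[i++];
        else dst[k] = src[j++];
    }
}

void sortIntoWork(std::vector<int>& a, std::vector<int>& work,
                  std::size_t lo, std::size_t hi);

// Sorts a[lo, hi); the result ends up in the original vector a.
void sortIntoOrig(std::vector<int>& a, std::vector<int>& work,
                  std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;                 // a single element is already in a
    const std::size_t mid = lo + (hi - lo) / 2;
    sortIntoWork(a, work, lo, mid);          // halves land in work ...
    sortIntoWork(a, work, mid, hi);
    mergeRuns(work, a, lo, mid, hi);         // ... and are merged back into a
}

// Sorts a[lo, hi); the result ends up in the working vector.
void sortIntoWork(std::vector<int>& a, std::vector<int>& work,
                  std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) {
        if (hi - lo == 1) work[lo] = a[lo];  // size 1: copy the single element
        return;
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    sortIntoOrig(a, work, lo, mid);          // halves land in a ...
    sortIntoOrig(a, work, mid, hi);
    mergeRuns(a, work, lo, mid, hi);         // ... and are merged into work
}

// Entry function: allocates the working vector once.
void mergeSortTopDown(std::vector<int>& nums) {
    std::vector<int> work(nums.size());
    sortIntoOrig(nums, work, 0, nums.size());
}
```

The only allocation is the single working vector in mergeSortTopDown, in contrast to the one vector per recursive call in the question's version.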

If using bottom up merge sort (which will be a bit faster), each pass swaps the roles of the two vectors. The number of passes needed is determined up front; if it is odd, the first pass instead swaps adjacent elements in place, so that the data ends up in the original vector after all merge passes are done.
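A sketch of the bottom up variant under the same assumptions (std::vector<int>, plain <= comparison, hypothetical names). The pass count is computed first, and an odd count is handled by an in-place pass over adjacent pairs:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Merge sorted runs src[lo, mid) and src[mid, hi) into dst[lo, hi).
void mergeInto(const std::vector<int>& src, std::vector<int>& dst,
               std::size_t lo, std::size_t mid, std::size_t hi) {
    std::size_t i = lo, j = mid;
    for (std::size_t k = lo; k < hi; ++k) {
        if (i < mid && (j >= hi || src[i] <= src[j])) dst[k] = src[i++];
        else dst[k] = src[j++];
    }
}

void mergeSortBottomUp(std::vector<int>& nums) {
    const std::size_t n = nums.size();
    if (n < 2) return;

    // Number of passes: the run width doubles from 1 until it covers n.
    std::size_t passes = 0;
    for (std::size_t w = 1; w < n; w *= 2) ++passes;

    std::size_t width = 1;
    if (passes % 2 == 1) {
        // Odd pass count: do the first pass in place by ordering adjacent
        // pairs, leaving an even number of vector-swapping passes.
        for (std::size_t i = 0; i + 1 < n; i += 2)
            if (nums[i + 1] < nums[i]) std::swap(nums[i], nums[i + 1]);
        width = 2;
    }

    std::vector<int> work(n);
    std::vector<int>* src = &nums;
    std::vector<int>* dst = &work;
    for (; width < n; width *= 2) {
        for (std::size_t lo = 0; lo < n; lo += 2 * width) {
            const std::size_t mid = std::min(lo + width, n);
            const std::size_t hi = std::min(lo + 2 * width, n);
            mergeInto(*src, *dst, lo, mid, hi);
        }
        std::swap(src, dst);   // swap the roles of the vectors (pointers only)
    }
    assert(src == &nums);      // an even number of swaps: data is back in nums
}
```

Only the pointers are swapped after each pass, so no element is copied beyond the merges themselves.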
