简体   繁体   中英

Why does the vector iterator point out of bounds?

I'm studying a recursive mergesort algorithm, and one iterator goes out of bounds. I'm positive the root of my problem is that my algorithm is flawed, but I've spent days pouring over it, and I just don't see my misstep. I don't know what direction to take. Can someone more experienced/smarter than I take a look? (Full program with driver is available on Github here .)

Output is:

before: 50 5 40 10 30 15 20 20 10 25 
after : -1808873259 5 10 10 15 20 20 25 30 40 50 
/*      ^  
 *      Extra recursive call, and out-of-bounds.
 */

To be clear, I am constrained to returning a vector of type T, in this case int, but I'm aware from this post that using a void function is better.

template <typename T>
vector<T> mergesort(typename vector<T>::iterator begin, typename vector<T>::iterator end){
    vector<T> newVector;
    if (begin!=end){
        vector<T> tmp1;
        vector<T> tmp2;
        typename vector<T>::iterator mid1 = begin;
        typename vector<T>::iterator mid2 = begin;

        long origDistance = distance(begin,end);
        long endOfRange1 = origDistance/2;
        long begOfRange2 = endOfRange1+1;

        advance(mid1,endOfRange1);
        advance(mid2,begOfRange2);

        tmp1 = mergesort<T>(begin,mid1);
        tmp2 = mergesort<T>(mid2,end);

        //"merge()" is from the STL, link in comments. 
        merge(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end(), back_inserter(newVector));

    } else {
        newVector.push_back(*begin);
    }
    return newVector;
}

You dereference begin when begin == end . That's undefined behavior. Probably you want if (origDistance == 1) then push_back the single element and return.

Your function looks like it could work, if end points to the last element of the vector. However in your sample program you call it like this:

 newVector = mergesort<int>(vec.begin(), vec.end()); 

The vec.end() points past the end of the vector, it doesn't point to the last element. So your function messes up because it ultimately tries to access the element pointed to by the second iterator you pass in.

You could call your function like: mergesort<int>(vec.begin(), vec.end() - 1); .

However this will surprise anyone else reading your code. It would be much better to rewrite your mergesort function to follow normal C++ range conventions, that is, the parameter named end should be past-the-end. mid1 should equal mid2 .

Okay - couldn't go to sleep without figuring this out, and huge credit goes to John Zwinck and MM for getting me in the right direction - here's the code which got the correct output:

template <typename T>
vector<T> mergesort(typename vector<T>::iterator begin, typename vector<T>::iterator end){
    vector<T> newVector;
    long origDistance = distance(begin,end); /*Get distance first.*/

    if (origDistance==1){ /*Added better anchor case checking for distance.*/
        newVector.push_back(*begin);
        return newVector;
    }

    vector<T> tmp1;
    vector<T> tmp2;
    typename vector<T>::iterator mid1 = begin;
    typename vector<T>::iterator mid2 = begin;

    long endOfRange1 = origDistance/2;
    long begOfRange2 = endOfRange1;/*Edited from: endOfRange+1*/

    advance(mid1,endOfRange1);
    advance(mid2,begOfRange2);

    tmp1 = mergesort<T>(begin,mid1);
    tmp2 = mergesort<T>(mid2,end);

    merge(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end(), back_inserter(newVector));
        return newVector;
}

Here I will show you how to do it.

template <typename T>
void mergesort(typename vector<T>::iterator, typename vector<T>::iterator);

// ...

    mergesort<int>(vec.begin(), vec.end());
    newVector = vec;

// ...

template <typename T>
void mergesort(typename vector<T>::iterator begin, typename vector<T>::iterator end){
    auto const N = std::distance(begin, end);
    if (N <= 1) return;                   
    auto const middle = std::next(begin, N / 2);
    mergesort<T>(begin, middle);
    mergesort<T>(middle, end);
    std::inplace_merge(begin, middle, end); 
}

Correct output:

before: 50 5 40 10 30 15 20 20 10 25 
after : 5 10 10 15 20 20 25 30 40 50 

STL already has inplace_merge so why reimplement it? With this approach you don't have to think has hard with boundaries.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM