Very Strange C++ Behavior

Question

I've come across a really weird bug in a C++ program I wrote for a school assignment (code pasted at the end), and I can't figure out why it's doing what it is. Particularly, it's sometimes giving randomly incorrect output, sometimes giving correct output, being run on the same input each time. If anyone might have insight as to why, I'd be very appreciative:

I've made a C++ program that has an implementation of a simple MaxHeap data structure which supports building the heap either using HeapInsert to insert elements into the heap one by one starting with an empty heap, and alternatively starting with an array of elements and using bubbledown on the first half of the elements to transform it into a heap -- the program takes one command-line argument, either HeapInsert which will use the first method of building a heap, or BubbleDown which will build the heap using the second method.

The program takes user input from cin: first the number of elements that will be given to make the heap out of, then the elements you want to put in the heap. After its done it outputs the number of exchanges performed in the bubbleup/bubbledown, and then the elements of the heap in order that they are in the array that stores the heap.

We've been given a sample input (of 100 random numbers) and a sample output that my code should produce in order to know our implementation is correct. I do the following on the command line:

g++ HeapTest.cpp
./a.out BubbleDown < 100.txt > out
diff out s100b.txt

100.txt is the sample input, s100b.txt is the correct sample output.

Executing the lines

./a.out BubbleDown < 100.txt > out
diff out s100b.txt

repeatedly, I get inconsistent results. It seems that half the time I get that my output matches the sample file completely, but half the time it does not, and particularly when I look at my output file, it looks like a random large number has been inserted into my heap for no reason, making my output wrong.

It makes absolutely no sense to me that the results would be inconsistent in running the code repeatedly with the exact same input. This only happens when I use the "BubbleDown" option on the command line. Below is my code:

#include <cstdlib>
#include <stdint.h>       
#include <iostream>
#include <string>
#include <cstring>
#include <cassert>
#include <cmath>
using namespace std;

struct MaxHeap { //MaxHeap data structure
    int n;      //size of the heap
    int numex;  //number of exchanges in building the heap
    int* A;     //Array storing the actual heap
    MaxHeap(int a){     //First Constructor: initializes an empty heap of size 0 in an array of size a
        n=0;    //initialize size to 0
        numex=0;//initialize numex to 0
        A = new int[a]; //allocate space for array of size A on heap
    }
    MaxHeap(int * data, int a){ //Second Constructor: consumes array of a elements and creates a heap
                            //out of thoses elements using bubbledown
        n = a;
        A = data;
        numex = 0;

        for(int k = (int)(floor((n-1)/2)); k > -1 ; k-=1){
            bubbledown(k);
        }
    }
    ~MaxHeap(){}    //necessary since MaxHeaps made with first constructor are non-contiguous
    void bubbleup(int v){//bubble-up algorithm as described in class
        int j;
        while( (v != 0) && (A[(int)(floor((v-1)/2))] < A[v]) ){
            numex +=1;
            j = A[v];
            A[v] = A[(int)(floor((v-1)/2))];
            A[(int)(floor((v-1)/2))] = j;
            v = (int)(floor((v-1)/2));

        }

    }
    void bubbledown(int v){//bubbledown algorithm as described in calss

        int j;
        int k;
        int L;
        int temp;
        while(true){

            j = 2*v+1;
            k = 2*v+2;
            if((j <= n) && (A[j] > A[v])){L = j;}
            else{L = v;}
            if((k <= n) && (A[k] > A[L])){L = k;}
            if(L == v){break;}
            else{numex +=1; temp = A[v]; A[v] = A[L]; A[L] = temp; v=L;}
        }

    }
    void HeapInsert(int i, int k){//heapinsert algorithm as described in class

        n=k+1;
        A[n-1] = i;
        bubbleup(n-1);

    }
};

void error(){

    cerr << "Usage: " << endl;
    exit(-1);

}

int main(int argc, char * argv[]){

    int flag;
    char hins[] = "HeapInsert";
    char bdwn[] = "BubbleDown";

    switch(argc){    
        case 2:
            if(strcmp(argv[1], hins) == 0){flag=0; break;}
            else if(strcmp(argv[1], bdwn) == 0){flag=1; break;}
            else{error();}
        default: error();
    }

    if(flag==0){//If HeapInsert option selected, the below creates a heap via HeapInsert

        int nelem;
        cin >> nelem;       //read in number of elements that are going to be given
        struct MaxHeap H = MaxHeap(nelem);  //call first constructor

        for(int k=0; k < nelem; k+=1){      //insert elements into the heap one by one as they are read in
            int i;
            cin >> i;
            H.HeapInsert(i,k);
        }

        cout << H.numex << endl;            //print number of exchanges

        for(int k =0;k < nelem; k+=1){      //print elements of heap 1 by 1

            cout << H.A[k] << endl;

        }

    }
    else{       //if BubbleDown option chosen by user

        int nelem;
        cin >> nelem;   //read in number of elements
        int data[nelem];    //initialize array to store that number of elements

        for(int k=0; k < nelem; k+=1){  //build array of elements in order given

            int i;
            cin >> i;
            data[k] = i;

        }

        struct MaxHeap H = MaxHeap(data, nelem);    //use second constructor to create a heap out of the array

        cout << H.numex << endl;            //print number of exchanges

        for(int k =0;k < nelem; k+=1){      //print out elements 1 by 1

            cout << H.A[k] << endl;

        }

    }
}

If anyone has any idea of how my code can be producing inconsistent results like this when it doesn't rely on any randomness or memory allocation (no memory allocation used when BubbleDown option is given), insight would be greatly appreciated!

Answer 1

I compiled your program with debug symbols...

gcc -g -O0 -o stuff stuff.cpp

And ran it in Valgrind ...

echo '4 2 3 4 5 6' | valgrind ./stuff BubbleDown

Here's what it said:

==28605== Conditional jump or move depends on uninitialised value(s)
==28605==    at 0x401186: MaxHeap::bubbledown(int) (stuff.cpp:52)
==28605==    by 0x400FCD: MaxHeap::MaxHeap(int*, int) (stuff.cpp:26)
==28605==    by 0x400E08: main (stuff.cpp:125)

Which seems to correspond to this:

if((j <= n) && (A[j] > A[v])){L = j;}

The problem seems to be that you're reading off the end of the array. If j == n , then it's one element past the end of the array. Same with k == n . If you change bubbledown to this, the problem goes away:

void bubbledown(int v){//bubbledown algorithm as described in calss
    while(true){
        const int j = 2*v+1;
        const int k = 2*v+2;
        int L;

        // notice < instead of <=
        if((j < n) && (A[j] > A[v])){
            L = j;
        }
        else{
            L = v;
        }

        // notice < instead of <=
        if((k < n) && (A[k] > A[L])){
            L = k;
        }
        if(L == v){
            break;
        }
        else{
            numex +=1;
            const int temp = A[v];
            A[v] = A[L];
            A[L] = temp;
            v = L;
        }
    }
}

Note: I used some Linux commands to do this (Valgrind most importantly). Whatever compiler toolchain / IDE you're using should have its own debugger which can presumably give you similar output. There's a Stack Overflow question about Valgrind substitutes in Windows . I suggest finding a tool like that you like -- It will make C++ debugging much easier.

Very Strange C++ Behavior

Question

1 answers

solution1
8 2011-10-11 23:13:00

Very Strange C++ Behavior

Question

1 answers

solution1 8 2011-10-11 23:13:00

solution1
8 2011-10-11 23:13:00