
Heap Buffer Overflow from a 2d array

I am new to C programming and to allocating memory. I have written a program that does matrix multiplication, allocating memory for each 1D row array within the 2D array for matrix 1, and likewise for matrix 2. My code is below, and I do not understand why I am getting a heap buffer overflow. The input is a file that contains the dimensions and components of both matrices. An example file might look like this:

    3       3
    1       2       3
    4       5       6
    7       8       9
    3       3
    1       2       3
    4       5       6
    7       8       9 

The first line, 3 and 3, means matrix 1 has 3 rows and 3 columns, so when read from the file these values are stored in rows1 and columns1. Next, the numbers 1-9 are the contents of the first matrix. The following 3 and 3 are the rows and columns of matrix 2, stored in rows2 and columns2. All of these numbers are separated by tabs. The file above is one of many I tested, and it gave me a heap buffer overflow.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
void print(int** square, int rows,int columns);

int main(int argc, char** argv) {
    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL) {
        printf("error\n");
        return 0;
    }
    int rows1 = 0; int columns1 = 0; int num = 0;
    fscanf(fp, "%d", &rows1);
    fscanf(fp, "%d", &columns1);

    int** square = (int**) malloc(sizeof(int*) * rows1);
    for (int i = 0; i < rows1; i++) {
        square[i] = (int*) malloc(sizeof(int) * columns1);
    }
    for (int i = 0; i < rows1; i++) {
        for (int j = 0; j < columns1; j++) {
            fscanf(fp, "%d", &num);
            square[i][j] = num;
        }
    }
    int rows2 = 0; int columns2; int num2 = 0;
    fscanf(fp, "%d", &rows2);
    fscanf(fp, "%d", &columns2);

    int** square2 = (int**) malloc(sizeof(int*) * rows2);
    for (int i = 0; i < rows2; i++) {
        square2[i] = (int*) malloc(sizeof(int) * columns2);
    }
    for (int i = 0; i < rows2; i++) {
        for (int j = 0; j < columns2; j++) {
            fscanf(fp, "%d", &num2);
            square2[i][j] = num2;
        }
    }
    if (columns1 != rows2) {
        printf("bad-matrices\n");
        return 0;
    }
    int ans = 0;
    int** answer = (int**) malloc(sizeof(int*) * rows1);
    for (int i = 0; i < rows1; i++) {
        answer[i] = (int*) malloc(sizeof(int) * columns2);
    }
    for (int i = 0; i < rows1; i++) {
        for (int j = 0; j < columns2; j++) {
            for (int k = 0; k < rows2; k++) {
                ans += square[i][k] * square2[k][j];
            }
            answer[i][j] = ans;
            ans = 0; 
        }
    }
    print(answer, rows1, columns2);
    fclose(fp);
    return 0;
}

void print(int** square, int rows, int columns) {
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < columns; j++) {
            printf("%d\t", square[i][j]);
        }
        printf("\n");
    }
    return;
}

Outcome:

==31599== ERROR: AddressSanitizer: heap-buffer-overflow on address..... 

"heap-buffer-overflow" means that you created a buffer of a certain size, but tried to access beyond the bounds of the buffer. This normally means that either you have a loop that's using the wrong value for an upper bound, or that one of your buffers is not actually the size that you think it is.

It's hard to tell for sure what's going on here. The code copy/pasted into my gcc appears to work as expected (I don't have access to AddressSanitizer at the moment though). The first thing I noticed about your code was that it uses values read from the input file both for buffer sizes and for loop bounds without any sort of sanity checking. My recommendation is to step through this code in your debugger and make sure that the values that get read from disk and the computed buffer sizes are all what you expect them to be. All it takes is for one of those fscanf() calls to encounter something unexpected, fail to convert a value, and throw all of your computations off.
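As a rough illustration of the sanity checking described above, here is a minimal sketch that checks the return value of every fscanf() call and rejects dimensions outside a chosen range. The helper names read_dims and read_matrix and the MAX_DIM limit are hypothetical, not part of the original program; adjust them to suit your data.

```c
#include <stdio.h>

/* Assumed upper bound on a dimension; not from the original program. */
#define MAX_DIM 1024

/* Read "rows cols" from fp. Returns 1 on success, or 0 if either value
 * is missing, is not a number, or lies outside 1..MAX_DIM. */
int read_dims(FILE *fp, int *rows, int *cols)
{
    if (fscanf(fp, "%d %d", rows, cols) != 2)
        return 0;
    if (*rows < 1 || *rows > MAX_DIM || *cols < 1 || *cols > MAX_DIM)
        return 0;
    return 1;
}

/* Read rows * cols integers into m, which must already be allocated.
 * Returns 1 only if every single conversion succeeds. */
int read_matrix(FILE *fp, int **m, int rows, int cols)
{
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            if (fscanf(fp, "%d", &m[i][j]) != 1)
                return 0;
    return 1;
}
```

In main you would call read_dims before each allocation and bail out (after closing the file) if it returns 0, so a malformed file can never produce a bogus buffer size or loop bound.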

Also, it might be useful if you include the entire output of the error message (don't forget to compile in debug mode). The AddressSanitizer output normally includes a stack trace that can point you to the line number where the problem occurred. Also useful would be the name and version number of your compiler, plus whatever command-line options you're using.

Using malloc

First, your code is fine, but that doesn't mean it doesn't contain problems. Let's start with your use of malloc, eg

int** answer = (int**) malloc(sizeof(int*) * rows1);

There is no need to cast the return of malloc; the cast is unnecessary. See: Do I cast the result of malloc? Further, and this is more style than anything else, the '*'s showing the levels of indirection go with the variable, not the type. Why?

int* a, b, c;

That certainly does not declare three pointers to int. It declares a single pointer and two integers, eg

int *a, b, c;

When setting the type-size for the allocation, if you always use the dereferenced pointer itself, you will never get your type-size wrong, eg

int **answer = malloc (rows1 * sizeof *answer);

If You Allocate It, You Must Validate It, & It's Up to You to Free It

For every allocation, you should check that the pointer returned by malloc, calloc, realloc is not NULL . Allocation functions do fail when you run out of memory. Always check.
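A minimal sketch of what that validation could look like for an array of row pointers, using the dereferenced-pointer sizeof form. The function name alloc_matrix is hypothetical, and using calloc to zero the rows is a choice made for this sketch, not something the original code does.

```c
#include <stdlib.h>

/* Allocate a rows x cols matrix of int as an array of row pointers.
 * Returns NULL on failure, after releasing anything already allocated. */
int **alloc_matrix(int rows, int cols)
{
    if (rows < 1 || cols < 1)
        return NULL;

    int **m = malloc(rows * sizeof *m);     /* one pointer per row */
    if (m == NULL)
        return NULL;

    for (int i = 0; i < rows; i++) {
        m[i] = calloc(cols, sizeof *m[i]);  /* zero-filled row of ints */
        if (m[i] == NULL) {
            while (i-- > 0)                 /* free rows 0..i-1 */
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    return m;
}
```

The caller then only has one pointer to test, for example `int **square = alloc_matrix(rows1, columns1); if (!square) { /* handle error */ }`.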

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Simply declare a function to free your pointer arrays, and pass each to the free function along with the row-count before your program exits, eg

void freearr (int **a, int rows)
{
    for (int i = 0; i < rows; i++)
        free (a[i]);
    free (a);
}

and

...
fclose(fp);

freearr (square, rows1);
freearr (square2, rows2);
freearr (answer, rows1);

return 0;

Why Do I Get: ERROR: AddressSanitizer: heap-buffer-overflow on address.....?

This is more a result of your compiler telling you to double-check your use of array bounds. Specifically here it most likely results from:

int **answer = malloc (rows1 * sizeof *answer);
for (int i = 0; i < rows1; i++)
    answer[i] = malloc (columns2 * sizeof *answer[i]);

for (int i = 0; i < rows1; i++) {
    for (int j = 0; j < columns2; j++) {
        for (int k = 0; k < rows2; k++) {
            ans += square[i][k] * square2[k][j];
        }
        answer[i][j] = ans;

Note how answer is sized using the bounds rows1 and columns2, while square is allocated using rows1, columns1 and square2 using rows2, columns2. Your compiler can help you spot potential heap overflow by keeping track of the variables used to size the allocation. Some compilers are better than others at this.

If the compiler cannot determine that the limits you are using to iterate over your array stay within the allocated size, it can throw a warning about a potential buffer overflow. (All it should care about is the value of the limits used, but as I said, some compilers are better than others.)

After allocating with the limits set out above, you then proceed to iterate over the pointer arrays with different limits that were read into separate and unrelated variables, using rows1 and columns2 to iterate over square, square2 and answer. Think about it: while you know columns1 == columns2, the compiler has no guarantee of that. The same goes for rows2 == rows1.

Your compiler has no guarantee that using rows1 with square2 won't write beyond its allocated size. Likewise it has no guarantee that using columns2 won't violate the bounds of square . Your test of columns1 != rows2 doesn't provide any guarantee for rows1 == columns2 or rows1 == rows2 , etc...
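One way to make the relationship between the limits explicit is to pass the shared inner dimension as a single parameter, so the same value bounds both the columns of the first matrix and the rows of the second. This is only a sketch of that idea; the function name matmul and its parameter names are hypothetical, and it assumes the caller has already verified columns1 == rows2 and allocated all three matrices with matching sizes.

```c
/* Multiply a (n x m) by b (m x p), storing the result in out (n x p).
 * The single parameter m serves as both the column count of a and the
 * row count of b, so the inner loop cannot use a mismatched bound. */
void matmul(int **a, int **b, int **out, int n, int m, int p)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < p; j++) {
            int sum = 0;                    /* reset for each output cell */
            for (int k = 0; k < m; k++)
                sum += a[i][k] * b[k][j];
            out[i][j] = sum;
        }
    }
}
```

With the variables from the question, the call would be `matmul(square, square2, answer, rows1, columns1, columns2);` placed after the `columns1 != rows2` check.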

So while all of the limits used are fine, your compiler cannot guarantee it and warns. However, since you have tediously picked through your code to know your limits are good, all it takes is a fraction of a second to confirm it, eg

 $ valgrind ./bin/read2darrq dat/arr_2-3x3.txt
==29210== Memcheck, a memory error detector
==29210== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==29210== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==29210== Command: ./bin/read2darrq dat/arr_2-3x3.txt
==29210==
90      96      102
216     231     246
342     366     390
==29210==
==29210== HEAP SUMMARY:
==29210==     in use at exit: 0 bytes in 0 blocks
==29210==   total heap usage: 13 allocs, 13 frees, 732 bytes allocated
==29210==
==29210== All heap blocks were freed -- no leaks are possible
==29210==
==29210== For counts of detected and suppressed errors, rerun with: -v
==29210== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
