
Difference between two methods of malloc

I want to create a 5x5 2D matrix. I usually use the following way of allocating memory:

int **M  = malloc(5 * sizeof(int *));
for (i = 0; i < 5; i++)
{
    M[i] =  malloc(5 * sizeof(int));
}

While reading a blog, I also found another way to do that:

int **M = malloc(5 * sizeof(int*));
M[0] = malloc((5*5) * sizeof(int));

My question is: what is the difference between the two methods, and which one is more efficient?

For the second code, note that you need to initialize the other array members for it to work correctly:

    for (int i = 1; i < 5; i++) {
        M[i] = M[0] + i * 5;
    }

So in the second code the array members (across all rows) are contiguous. Element access is unchanged (e.g., you can still use the M[i][j] syntax). Compared with the first code, it needs only two malloc calls and, as mentioned in the comments, it is friendlier to the cache, which can greatly improve access performance.
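To put the pieces of the second method together, here is a minimal sketch. The names matrix_alloc_contiguous and matrix_free_contiguous are hypothetical helpers of mine, not something from the question or the blog; the sketch only assumes the layout described above (one block of row pointers, one block holding every int).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: one malloc for the row pointers, one malloc for all
   rows * cols ints, then each row pointer is aimed into the data block. */
int **matrix_alloc_contiguous(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    int **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return NULL;

    m[0] = malloc(rows * cols * sizeof *m[0]);
    if (m[0] == NULL) {
        free(m);
        return NULL;
    }

    /* Row i starts i * cols ints after the start of the data block. */
    for (size_t i = 1; i < rows; i++)
        m[i] = m[0] + i * cols;

    return m;
}

/* Release the data block first, then the row-pointer block. */
void matrix_free_contiguous(int **m)
{
    if (m != NULL) {
        free(m[0]);
        free(m);
    }
}
```

Cleanup takes exactly two free calls, mirroring the two malloc calls, no matter how many rows there are.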

But if you plan to dynamically allocate large arrays, the first method can be the better choice because of memory fragmentation: a large contiguous block may not be available, and requesting one can make fragmentation worse.
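For comparison, a rough sketch of the first (row-by-row) method with the error handling the question's snippet leaves out. The helper names matrix_alloc_rows and matrix_free_rows are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the first `rows` rows, then the pointer block. */
void matrix_free_rows(int **m, size_t rows)
{
    if (m == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(m[i]);
    free(m);
}

/* Hypothetical helper: one malloc per row, so rows may end up anywhere
   on the heap and are not guaranteed to be adjacent. */
int **matrix_alloc_rows(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    int **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        m[i] = malloc(cols * sizeof *m[i]);
        if (m[i] == NULL) {
            /* Only rows 0 .. i-1 were allocated; release those and m. */
            matrix_free_rows(m, i);
            return NULL;
        }
    }
    return m;
}
```

Here the caller has to remember the row count in order to free the matrix, which is part of the bookkeeping cost of this layout.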

A similar example of this kind of dynamic allocation of arrays of arrays can be found in the c-faq: http://c-faq.com/aryptr/dynmuldimary.html

After seeing ouah's answer and seeing the example in the C FAQ, I now understand where the second technique comes from, although I personally wouldn't use it where I could help it.

The main problem with the first approach you show is that the rows in the array are not guaranteed to be adjacent in memory; IOW, the object immediately following M[0][4] is not necessarily M[1][0] . If two rows are allocated from different pages, that could degrade runtime performance.

The second approach guarantees that all the rows will be allocated contiguously, but you have to manually assign M[1] through M[4] to get the normal M[i][j] subscripting to work, as in

for ( size_t i = 1; i < 5; i++ )
  M[i] = M[i-1] + 5;

IMO it's a clumsy approach compared to the following:

int (*M)[5] = malloc( sizeof *M * 5 );

This also guarantees that the memory is allocated contiguously, and the M[i][j] subscripting works without any further effort.
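A small sketch of the pointer-to-array approach with a compile-time column count. COLS, the row5 typedef, and the two helper functions are my own illustrative names; the typedef is only there to keep the return type readable.

```c
#include <assert.h>
#include <stdlib.h>

#define COLS 5

/* One row of the matrix: an array of COLS ints. */
typedef int row5[COLS];

/* Hypothetical helper: a single malloc holds `rows` rows back to back. */
row5 *grid_alloc(size_t rows)
{
    assert(rows > 0);
    row5 *m = malloc(rows * sizeof *m);
    return m; /* may be NULL; the caller must check */
}

/* Fill the matrix so that element (i, j) holds i * COLS + j. */
void grid_fill(row5 *m, size_t rows)
{
    assert(m != NULL);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < COLS; j++)
            m[i][j] = (int)(i * COLS + j);
}
```

Since there is only one allocation, a single free(m) releases the whole matrix, and no row-pointer table is needed at all.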

However, there is a drawback; on compilers that don't support variable-length arrays, the array size must be known at compile time. Unless your compiler supports VLAs, you can't do something like

size_t cols;
...
int (*M)[cols] = malloc( sizeof *M * rows );

In that case, the M[0] = malloc( rows * cols * sizeof *M[0]) followed by manually assigning M[1] through M[rows - 1] would be a reasonable substitute.

I hope I'm not missing something here but here's my attempt to answer the question "What is the difference...". If I am completely off base, forgive me and I will correct my answer but here goes:

I tried drawing out what is happening in your two mallocs so what I have to say is tied to the picture included which I drew by hand (hand crafted answers?)

First option:

For the first option, you allocate a memory block the size of 5 int*s. M, which is an int**, points to the start of that memory block.

Then, you go over each of the memory blocks (the size of int*) and in each block you put in the address of a memory block the size of 5 ints. Note that these are located in some random portion of your memory (the heap) that has enough space to take the size of 5 ints.

This is the key - it's a noncontiguous block of memory. So if you think about memory as an array, you are pointing at different start locations in the array.

Second Option

Your second option allocates the int** block in exactly the same way. But instead of one block per row, it allocates space for 25 ints and places the address of that block in M[0]. Note: you've never placed any address in M[1] through M[4].

So, what happens? You have a contiguous block of 25 ints whose address is stored in M[0]. What happens when you try to read M[1]? You guessed it: it was never initialized, so it holds an indeterminate (junk) value. Worse, that value does not point to allocated memory, so dereferencing it is undefined behavior and will typically segfault.

[Image: hand-drawn diagram of the two allocations]

If you want to allocate a 5x5 array in contiguous memory, the correct approach would be

int rows = 5;
int cols = 5;
int (*M)[cols] = malloc(rows * sizeof(*M));

You can then access the array with normal array indexing, eg

M[3][2] = 6;
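If the column count is only known at run time, such a matrix can still be handed to functions, provided the compiler supports variable-length array types (C99; optional since C11). The sketch below is just one way to do it; grid_alloc, grid_fill and grid_sum are hypothetical names. Note that the size parameters must come before the parameter whose type depends on them.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns void * because a function cannot return a
   variably modified type; the caller assigns it to an int (*)[cols]. */
void *grid_alloc(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    return malloc(rows * sizeof(int[cols]));
}

/* Set every element of a rows x cols matrix to `value`. */
void grid_fill(size_t rows, size_t cols, int (*m)[cols], int value)
{
    assert(m != NULL);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            m[i][j] = value;
}

/* Add up every element of a rows x cols matrix. */
long grid_sum(size_t rows, size_t cols, int (*m)[cols])
{
    assert(m != NULL);
    long total = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += m[i][j];
    return total;
}
```

A caller would write something like int (*M)[cols] = grid_alloc(rows, cols); check for NULL, use M[i][j] as usual, and release everything with one free(M).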

int **M = malloc(5 * sizeof(int *)); allocates memory for five pointers to int. M[i] = malloc(5 * sizeof(int)); allocates memory for five ints, i.e. one row.

Maybe this will help you understand what is going on:

int **M  = malloc(5 * sizeof(void *));
/* 'void *' and 'int *' have the same size on most platforms, although the C standard does not require it */
for (i = 0; i < 5; i++)
{
    M[i] =  malloc(5 * sizeof(int));
}

There is another small issue with malloc((5*5) * sizeof(int));. It is certainly a side issue to what OP is asking, but still a concern.

The two calls below are equivalent: whichever order the two operands come in, the product is computed with size_t math.

#define N 5
malloc(N * sizeof(int));
malloc(sizeof(int) * N);

Consider:

#define N some_large_value
malloc((N*N) * sizeof(int));

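The result of sizeof has type size_t, an unsigned integer type whose maximum, SIZE_MAX, is at least INT_MAX on practically every implementation and possibly far larger. In (N*N) * sizeof(int), however, N*N is evaluated first in int arithmetic and can overflow before the multiplication by sizeof(int) happens. To avoid an int overflow in a product that would still fit in size_t, use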

malloc(sizeof(int) * N * N);
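Reordering the operands avoids the int overflow, but the size_t product itself can still wrap around for very large N. If that matters, one possible guard (my own sketch, with a hypothetical helper name alloc_square) is to check against SIZE_MAX before multiplying:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an n x n block of ints, or returns NULL
   when sizeof(int) * n * n would not fit in size_t. */
int *alloc_square(size_t n)
{
    assert(n > 0);

    /* n * n * sizeof(int) <= SIZE_MAX  exactly when
       n <= SIZE_MAX / sizeof(int) / n  (integer division). */
    if (n > SIZE_MAX / sizeof(int) / n)
        return NULL;

    return malloc(sizeof(int) * n * n);
}
```

Without the check, a wrapped product would make malloc return a block that is far smaller than the caller expects.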


 