
Does malloc() allocate a contiguous block of memory?

I have a piece of code written by a very old-school programmer :-). It goes something like this:

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[1]; 
} ts_request_def; 

ts_request_def* request_buffer = 
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));

The programmer is basically relying on a buffer overflow concept. I know the code looks dodgy, so my questions are:

  1. Does malloc always allocate a contiguous block of memory? If the blocks are not contiguous, this code will fail badly.

  2. Will free(request_buffer) free all the bytes allocated by malloc, i.e. sizeof(ts_request_def) + (2 * 1024 * 1024), or only sizeof(ts_request_def) bytes, the size of the structure?

  3. Do you see any evident problems with this approach? I need to discuss this with my boss and would like to point out any loopholes.

To answer your numbered points.

  1. Yes.
  2. All the bytes. malloc and free don't know or care about the type of the object, just the size.
  3. It is strictly speaking undefined behaviour, but a common trick supported by many implementations. See below for other alternatives.

Since ISO/IEC 9899:1999 (informally C99), the C standard has allowed flexible array members.

An example of this would be:

#include <stdlib.h>

int main(void)
{
    struct { size_t x; char a[]; } *p;
    p = malloc(sizeof *p + 100);
    if (p)
    {
        /* You can now access up to p->a[99] safely */
        p->a[99] = 0;
        free(p);
    }
    return 0;
}

This now-standardized feature allows you to avoid the common, but non-standard, implementation extension that you describe in your question. Strictly speaking, using a non-flexible array member and accessing beyond its bounds is undefined behaviour, but many implementations document and encourage it.

Furthermore, gcc allows zero-length arrays as an extension. Zero-length arrays are illegal in standard C, but gcc introduced this feature before C99 gave us flexible array members.

In response to a comment, I will explain why the snippet below is technically undefined behaviour. The section numbers I quote refer to C99 (ISO/IEC 9899:1999).

struct {
    char arr[1];
} *x;
x = malloc(sizeof *x + 1024);
x->arr[23] = 42;

Firstly, 6.5.2.1#2 shows a[i] is identical to (*((a)+(i))), so x->arr[23] is equivalent to (*((x->arr)+(23))). Now, 6.5.6#8 (on the addition of a pointer and an integer) says:

"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."

For this reason, because x->arr[23] is not within the array, the behaviour is undefined. You might still think that it's okay because the malloc() implies the array has now been extended, but this is not strictly the case. Informative Annex J.2 (which lists examples of undefined behaviour) provides further clarification with an example:

An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).

3 - That's a pretty common C trick to allocate a dynamic array at the end of a struct. The alternative would be to put a pointer into the struct and allocate the array separately, remembering to free it as well. Fixing the size at 2 MB seems a bit unusual, though.
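For comparison, here is a minimal sketch of that pointer-in-struct alternative. The names header_t, request_t, request_create and request_destroy are hypothetical stand-ins, not taken from the code in the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the header type from the question. */
typedef struct { unsigned id; } header_t;

typedef struct {
    header_t header;
    size_t   package_size;
    char    *package;        /* payload lives in a second allocation */
} request_t;

/* Allocate the struct and its payload; returns NULL on failure. */
request_t *request_create(size_t package_size)
{
    /* malloc(0) is implementation-defined, so require a non-empty payload. */
    assert(package_size > 0);

    request_t *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;

    r->package = malloc(package_size);
    if (r->package == NULL) {
        free(r);             /* do not leak the struct if the payload fails */
        return NULL;
    }
    r->header.id = 0;
    r->package_size = package_size;
    return r;
}

/* Both allocations must be released: payload first, then the struct. */
void request_destroy(request_t *r)
{
    if (r != NULL) {
        free(r->package);
        free(r);
    }
}
```

The extra bookkeeping (two mallocs, two frees, a failure path in between) is what the single-block trick avoids.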

1) Yes, it does. If there isn't a large enough contiguous block available, malloc fails and returns a NULL pointer.

2) Yes it will. The internal memory allocation will keep track of the amount of memory allocated with that pointer value and free all of it.

3) It's a bit of a language hack, and I'm a bit dubious about its use. It's still subject to buffer overflows as well; it may just take attackers slightly longer to find a payload that triggers one. The cost of the 'protection' is also pretty hefty (do you really need more than 2 MB per request buffer?). It's also very ugly, although your boss may not appreciate that argument :)

This is a standard C trick, and it isn't more dangerous than any other buffer.

If you are trying to show your boss that you are smarter than the "very old school programmer", this code isn't the case to make. Old school isn't necessarily bad. It seems the "old school" guy knows enough about memory management ;)

I don't think the existing answers quite get to the essence of this issue. You say the old-school programmer is doing something like this;

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[1]; 
} ts_request_def;

ts_request_def* request_buffer = 
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));

I think it's unlikely he's doing exactly that, because if that's what he wanted to do he could do it with simplified equivalent code that doesn't need any tricks;

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[2*1024*1024 + 1]; 
} ts_request_def;

ts_request_def* request_buffer = 
malloc(sizeof(ts_request_def));

I'll bet that what he's really doing is something like this;

typedef struct ts_request
{ 
  ts_request_buffer_header_def header; 
  char                         package[1]; // effectively package[x]
} ts_request_def;

ts_request_def* request_buffer = 
malloc( sizeof(ts_request_def) + x );

What he wants to achieve is allocation of a request with a variable package size x. It is of course illegal to declare a struct member array's size with a variable, so he is getting around this with a trick. To me it looks as if he knows what he's doing; the trick is well towards the respectable and practical end of the C trickery scale.
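A minimal sketch of such a variable-size allocation, written with a C99 flexible array member rather than package[1]. The header layout, the type name request, and the functions ts_request_alloc and ts_request_fill are assumptions for illustration; the original code's header type is not shown in the question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ts_request_buffer_header_def. */
typedef struct { size_t package_size; } request_header;

typedef struct {
    request_header header;
    char           package[];   /* flexible array member, must be last */
} request;

/* Allocate header plus x payload bytes as one contiguous block.
   Returns NULL if the size would wrap around or malloc fails. */
request *ts_request_alloc(size_t x)
{
    if (x > SIZE_MAX - sizeof(request))
        return NULL;
    request *r = malloc(sizeof(request) + x);
    if (r == NULL)
        return NULL;
    r->header.package_size = x;
    memset(r->package, 0, x);
    return r;
}

/* Write the same byte to every payload position, up to the recorded size. */
void ts_request_fill(request *r, int byte)
{
    assert(r != NULL);
    memset(r->package, byte, r->header.package_size);
}
```

A single free(r) releases the header and the payload together, since both live in the one block returned by malloc.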

As for #3, without more code it's hard to answer. I don't see anything wrong with it, unless it's happening a lot. I mean, you don't want to allocate 2 MB chunks of memory all the time. You also don't want to do it needlessly, e.g. if you only ever use 2 KB.

The fact that you don't like it for some reason isn't sufficient to object to it, or to justify completely rewriting it. I would look at the usage closely, try to understand what the original programmer was thinking, and look closely for buffer overflows (as workmad3 pointed out) in the code that uses this memory.

There are lots of common mistakes that you may find. For example, does the code check to make sure malloc() succeeded?

Whether this is exploitable (question 3) really depends on the interface to this structure of yours. In context this allocation might make sense, and without further information it is impossible to say whether it's secure or not.
But if you mean problems with allocating more memory than the structure itself needs, this is by no means bad C design (I wouldn't even say it's THAT old school... ;) )
Just a final note here: the point of declaring char[1] is that the terminating NUL always fits inside the declared struct, meaning there can be 2 * 1024 * 1024 characters in the buffer and you don't have to account for the NUL with a "+1". It might look like a small thing, but I just wanted to point it out.

I've seen and used this pattern frequently.

Its benefit is that it simplifies memory management and thus reduces the risk of memory leaks. All it takes is a single free of the malloc'ed block. With a secondary buffer, you need two calls to free. However, one should define and use a destructor function to encapsulate this operation, so that you can always change its behavior later, such as switching to a secondary buffer or adding operations to be performed when the structure is deleted.
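A minimal sketch of such a destructor function for the single-block layout. The names req_header, req, req_create and req_destroy are hypothetical, and the overflow check on the size computation is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the header type in the question. */
typedef struct { int id; } req_header;

typedef struct {
    req_header header;
    char       package[];   /* payload shares the malloc'ed block */
} req;

/* Allocate header and payload in one block; NULL on failure. */
req *req_create(size_t package_size)
{
    req *r = malloc(sizeof *r + package_size);
    if (r != NULL)
        r->header.id = 0;
    return r;
}

/* The one place that knows how a req is released. It takes the address
   of the caller's pointer so the pointer can be reset to NULL afterwards. */
void req_destroy(req **rp)
{
    assert(rp != NULL);
    free(*rp);              /* free(NULL) is a no-op */
    *rp = NULL;
}
```

If the layout later changes to a separately allocated payload, only req_create and req_destroy need to change; callers keep calling req_destroy(&r).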

Access to array elements is also slightly more efficient but that is less and less significant with modern computers.

The code will also work correctly if the structure's memory alignment changes between compilers, which happens quite often.

The only potential problem I see would be the compiler permuting the storage order of the member variables, because this trick requires that the package field remain last in storage. The C standard rules this out, though: the members of a struct are laid out in the order in which they are declared (C99 6.7.2.1).

Note also that the allocated buffer will most probably be bigger than required: by at least one byte (the declared package[1]), plus any padding bytes.
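The size difference can be made concrete with offsetof. This is only a sketch of the arithmetic: the struct padded_request and both function names are hypothetical, and the actual amount of padding depends on the compiler and platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a header wider than char, so padding is likely. */
typedef struct {
    long header;
    char package[1];
} padded_request;

/* Bytes requested by the idiom in the question for n payload bytes. */
size_t idiom_size(size_t n)
{
    assert(n <= SIZE_MAX - sizeof(padded_request));
    return sizeof(padded_request) + n;
}

/* Bytes strictly needed: everything before package, plus n payload bytes. */
size_t exact_size(size_t n)
{
    assert(n <= SIZE_MAX - offsetof(padded_request, package));
    return offsetof(padded_request, package) + n;
}
```

The difference idiom_size(n) - exact_size(n) is the declared package[1] byte plus any trailing padding in the struct.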

Yes. malloc returns only a single pointer - how could it possibly tell a requester that it had allocated multiple discontiguous blocks to satisfy a request?

I would like to add that not only is it common, but I might also call it standard practice, because the Windows API is full of such uses.

Check the very common BITMAP header structure for example.

http://msdn.microsoft.com/en-us/library/aa921550.aspx

The last member, an array of RGB quads, is declared with a size of 1 and depends on exactly this technique.

In response to your second question.

free always releases all the memory allocated at a single shot.

int *i = malloc(1024 * 2);   /* request 2 KB */

/* free(i + 1); */   /* wrong: undefined behaviour, this pointer was not returned by malloc */

free(i);   /* releases the whole 2 KB block */

The answer to questions 1 and 2 is yes.

About ugliness (i.e. question 3): what is the programmer trying to do with that allocated memory?

The thing to realize here is that malloc does not see the calculation being made in this call:

malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));

It's the same as

  size_t sz = sizeof(ts_request_def) + (2 * 1024 * 1024);
  malloc(sz);

You might think that it's allocating two chunks of memory, and in your mind they are "the struct" and "some buffers". But malloc doesn't see that at all; it only receives a single size.
