简体   繁体   中英

How to use malloc() and memset()

I am very new to C and trying to implement a copy_buffer function to allocate a new buffer, copy the contents from an existing buffer into the new buffer, and return the new buffer.

I am trying to use malloc() and memset() , and I understand I need to malloc twice: one for the struct and one for the data. Do I need to memset twice too? Please point out where I wrote wrong for copy_buffer function.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Structure to hold a buffer of binary data
// Note that we expect that it could hold null(0) characters,
// so we can't rely on strlen to tell us the size or strcpy to copy the data
typedef struct BufferStruct {
    char *data;
    int size;
} Buffer;

Buffer *copy_buffer(Buffer *buffer) {
    Buffer *new_buffer = (Buffer*)malloc(sizeof(Buffer));
    memset(new_buffer, 0, sizeof(Buffer));
    new_buffer->data = malloc(sizeof(char));
    memset(new_buffer->data, 0, sizeof(char));
    for (int i = 0; i < buffer->size; ++i) {
        new_buffer->data[i] = buffer->data[i];
    }
    new_buffer->size = buffer->size;
    return new_buffer;
}

Buffer *example1() {
    Buffer *buffer = (Buffer*)malloc(sizeof(Buffer));
    buffer->data = "hello world\nthis is a string";
    buffer->size = strlen(buffer->data);
    return buffer;
}

// Example buffer storing 3 totally different strings in the same buffer (note the '\0')
Buffer *example2() {
    Buffer *buffer = (Buffer*)malloc(sizeof(Buffer));
    buffer->data = "this string has null\0characters\0 in the middle, beware!";
    buffer->size = 55;
    return buffer;
}

//
// Use fopen to create a file for writing
// Then fwrite to write the buffer to a file
//
void write_buffer(const char *filename, Buffer *buffer) {
    FILE *file = fopen(filename, "w");
    fwrite(buffer->data, 1, buffer->size, file);
    fclose(file);
}

int main() {
    Buffer *example = example2();
    Buffer *copied = copy_buffer(example);

    write_buffer("example1.bin", copied);
    return 0;
}

Initialization

Do I need to memset twice too?

Technically no, provided you properly initialize all elements of the Buffer struct before using them.

I feel this is a risky habit, however. It's very difficult to be consistent, and in some contexts the program can crash if you make a mistake. Many developers seem to recommend initializing or zero-initializing variables, particularly data buffers, but opinions (and circumstances) vary.

You can use calloc with the same effect as malloc and memset :

Buffer *buffer = calloc(1, sizeof(Buffer));

This allocates and zero-initializes memory.

Memory Issues

As @Chris Rollins pointed out,

new_buffer->data = malloc(sizeof(char));

only allocates a single byte of storage because sizeof(char) is 1. When the program copies data from from one buffer to the other, it starts overwriting unknown memory on the heap!

for (int i = 0; i < buffer->size; ++i) {
    // new_buffer only has space for one byte,
    // but we can copy several!
    new_buffer->data[i] = buffer->data[i];  // BOOM
}

This is a buffer overflow . It is serious flaw and can/will crash the program. You aren't alone; similar mistakes have cause many security bugs.

Runtime memory analysis can catch bugs like this. If you are compiling with Clang or gcc, the address sanitizer (often called ASAN) will point out the specific error.

Compile with something like this:

gcc main.c -o main -fsanitize=address -fno-omit-frame-pointer -g

and you'll get an error message that points out the faulty lines of code:

=================================================================
==28677==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000efb1 at pc 0x000000400d29 bp 0x7fffa8a5db00 sp 0x7fffa8a5daf0
WRITE of size 1 at 0x60200000efb1 thread T0
    #0 0x400d28 in copy_buffer main.c:20
    #1 0x400f5d in main main.c:52
    #2 0x7ff7256cf82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #3 0x400a88 in _start (main+0x400a88)

0x60200000efb1 is located 0 bytes to the right of 1-byte region [0x60200000efb0,0x60200000efb1)
allocated by thread T0 here:
    #0 0x7ff725b11602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x400c09 in copy_buffer main.c:17
    #2 0x400f5d in main main.c:52
    #3 0x7ff7256cf82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

Dense, but worth deciphering:

  • there's a buffer overflow on the heap: heap-buffer-overflow on address
  • the call stack of the offending function call: in copy_buffer main.c:20 in main main.c:52
  • where the memory was allocated: in copy_buffer main.c:17 in main main.c:52

The line numbers are specific to my .c file and may vary on your machine.

memcpy

As @Steve Summit pointed out, you can replace the for loop:

for (int i = 0; i < buffer->size; ++i) {
    new_buffer->data[i] = buffer->data[i];
}

with a single call to memcpy :

memcpy(new_buffer->data, buffer->data, buffer->size);

this has the same risks as the for loop but is more concise/easier to read. It should be just as fast, too.

Solution

Incorporating these ideas into copy_buffer could look something like this:

Buffer *copy_buffer(Buffer *buffer) {
    Buffer *new_buffer = calloc(1, sizeof(Buffer));
    new_buffer->data = calloc(buffer->size, sizeof(char));
    memcpy(new_buffer->data, buffer->data, buffer->size);
    new_buffer->size = buffer->size;
    return new_buffer;
}
  • calloc allocates and initializes memory on the heap
  • new_buffer->data is large enough to hold all of buffer 's data
  • memcpy() concisely copies data from one buffer to another

Of course, don't forget to free the memory that you allocate!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM