简体   繁体   中英

strcat implementation using pointers

Could anyone know that when I write in this way, the program crashes.

#include<stdio.h>
#include<stdlib.h>

void mystrcat(char *s,char *t) {
    while(*s++);
    s--;
    while(*s++ = *t++);
}

int main(void) {
    int size = 1024;
    char *s1, *s2;

    s1 = (char *)malloc(size);
    //s1[0] = '\0';   ********NOTE THIS********
    s2 = (char *)malloc(size);
    //s2[0] = '\0';   ********NOTE THIS********
    mystrcat( s1, "Hello " );
    mystrcat( s2, "World" );
    mystrcat( s1, s2 );
    printf( "\"%s\"\n", s1 );
    return 0;
}

But strangely, when I do not use those two "//" comments, it works!!! So why adding those simple s2[0] = '\\0'; could make this program work.

When you allocate memory, either through the old C malloc function or the C++ new operator, that memory is not initialized in any way. Reading that memory as it were initialized leads to undefined behavior , and undefined behavior (or UB as it's often shortened) is one of the main reasons for crashes.

The returned pointer by malloc() is not guaranteed to be 0 -filled, ( or, initialized to any value, at all, for that matter ). So other than the s1[0] = '\\0'; part, while(*s++); may not be doing what you're expecting.

Without the initial zeroing part, while(*s++); cannot prevent the read-before-write scenario.

It is undefined behavior because of

  1. Reading unitialized memory location (indeterminate value)
  2. Going past the allocated memory, in search of terminating null.

In this case, as pointed out by Mr. Peter in the comments, however, the first point itself causes the UB and there is no guarantee that it will reach to the second point. However, in some other scenario, even if the memory is initialized but not null-terminated, you'll hit the second point to invoke the UB.

In C every string is terminated by the '\\0' character.
malloc just allocates the memory, it doesn't writes the '\\0' for You.
If You don't add it the program won't know where is the end of the string, and propably will try to read some memory after the actual string, that is not allocated, so it will cause undefined behaviour.

Here actually the mystrcat function increments the pointer until it points to a '\\0' character or a 0 .
But if there's no 0 found in the allocated memory, then after the next incrementation of the pointer, it will point to some unallocated memory.
Dereferencing it now will cause undefined behaviour.

As the other answers have said, you need to initialize that memory. You can do that in more than one way, but one way would be to use calloc instead of malloc. If you change these two lines:

s1 = (char *)malloc(size);    
s2 = (char *)malloc(size);

to:

s1 = calloc(size,sizeof(*s1));   
s2 = calloc(size,sizeof(*s2));

Your program will run.

When you call malloc , you receive your char* to some memory. You own size bytes of that memory. But malloc does not prepare the memory you now own in any way. It is left in whatever state the previous owning process left it in. Thus, it could very likely be the case that the memory you receive already contains a string that is longer than size .

Thus when you begin your strcat and run to the end of the first string it will run past the length of size , and attempt to start writing to this memory. Here the problem arises, because you don't own the memory at that location, and thus the program segfaults.

On the other hand, if you initialise the strings by letting the first byte be "\\0" , you are in effect setting the length of the string to 0 (because there are 0 bytes before the end token: "\\0" ). Thus when you begin your strcat , it will again run to the end of the first string, but this time the end is within size .

Beware you may still run into problems if the combined string would be longer than size .

The statement

while(*s++);  

make no sense for first two calls of mystrcat and invoke undefined behavior. You should not read an uninitialized memory.

You need to chech whether the first argument passed is an empty string or not

void mystrcat(char *s,char *t) {
    if(strlen(s))
    {
        while(*s++);
        s--;
    }
    while(*s++ = *t++);
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM