strcat implementation using pointers

Does anyone know why the program crashes when I write it this way?

#include<stdio.h>
#include<stdlib.h>

void mystrcat(char *s,char *t) {
    while(*s++);
    s--;
    while(*s++ = *t++);
}

int main(void) {
    int size = 1024;
    char *s1, *s2;

    s1 = (char *)malloc(size);
    //s1[0] = '\0';   ********NOTE THIS********
    s2 = (char *)malloc(size);
    //s2[0] = '\0';   ********NOTE THIS********
    mystrcat( s1, "Hello " );
    mystrcat( s2, "World" );
    mystrcat( s1, s2 );
    printf( "\"%s\"\n", s1 );
    return 0;
}

But strangely, when I uncomment those two "//" lines, it works! So why does adding a simple s1[0] = '\0'; and s2[0] = '\0'; make this program work?

When you allocate memory, either through the old C malloc function or the C++ new operator, that memory is not initialized in any way. Reading that memory as if it were initialized leads to undefined behavior, and undefined behavior (or UB, as it's often shortened) is one of the main reasons for crashes.
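As a minimal sketch of the fix from the question (setting the first byte to '\0' right after allocating), the allocation and the initialization can be wrapped together. The helper name alloc_empty_string is hypothetical and not part of the original code:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes and turn them into a valid
 * empty string by writing the terminator into the first byte.
 * Returns NULL if size is 0 or if malloc fails. */
char *alloc_empty_string(size_t size) {
    if (size == 0) {
        return NULL;            /* no room even for the terminator */
    }
    char *p = (char *)malloc(size);
    if (p != NULL) {
        p[0] = '\0';            /* only this byte is initialized */
    }
    return p;
}
```

With this, s1 = alloc_empty_string(size); would give mystrcat a defined starting point. The remaining bytes are still uninitialized, but a string function never reads past the terminator.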

The memory returned by malloc() is not guaranteed to be zero-filled (or initialized to any value at all, for that matter). So without the s1[0] = '\0'; part, while(*s++); may not be doing what you're expecting.

Without the initial zeroing part, while(*s++); cannot prevent the read-before-write scenario.

It is undefined behavior because of:

  1. Reading an uninitialized memory location (indeterminate value).
  2. Going past the allocated memory in search of the terminating null.

In this case, however, as pointed out by Mr. Peter in the comments, the first point by itself causes the UB, and there is no guarantee that execution will even reach the second point. In some other scenario, where the memory is initialized but not null-terminated, you'll hit the second point and invoke UB that way.
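The second point can be avoided when the buffer's capacity is known, by bounding the search for the terminator. This is only a sketch: find_terminator is a hypothetical name, and it assumes the first cap bytes have already been initialized (it does nothing about the first point):

```c
#include <string.h>

/* Hypothetical helper: look for the terminating '\0' within the first
 * `cap` bytes of `buf`. Returns a pointer to the terminator, or NULL if
 * none is found inside the buffer. Never reads past buf[cap - 1].
 * Assumes the first `cap` bytes of `buf` are initialized. */
const char *find_terminator(const char *buf, size_t cap) {
    return (const char *)memchr(buf, '\0', cap);
}
```

A NULL result means the buffer is not a valid C string, so an unbounded loop such as while(*s++); would run off the end of it.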

In C, every string is terminated by the '\0' character.
malloc just allocates the memory; it doesn't write the '\0' for you.
If you don't add it, the program won't know where the end of the string is, and will probably try to read memory beyond the allocation while looking for it, which causes undefined behaviour.

Here the mystrcat function actually increments the pointer until it finds a '\0' character (that is, a byte with the value 0).
But if no 0 is found in the allocated memory, then after the next increment the pointer will point to unallocated memory.
Dereferencing it then causes undefined behaviour.
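The two phases described above (walk to the terminator, then copy) can be written out more explicitly. This is a sketch with the same behavior as the question's mystrcat, under a hypothetical name; it still requires s to be a valid null-terminated string with enough room for t:

```c
/* Hypothetical, more explicit variant of mystrcat.
 * Precondition: s is null-terminated and its buffer has room for t. */
char *my_strcat_explicit(char *s, const char *t) {
    char *start = s;

    /* Phase 1: advance s until it points at the terminating '\0'. */
    while (*s != '\0') {
        s++;
    }

    /* Phase 2: copy t over the terminator, including t's own '\0'. */
    while ((*s = *t) != '\0') {
        s++;
        t++;
    }
    return start;
}
```

Because phase 1 stops on the terminator instead of one past it, the s--; correction from the original is not needed.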

As the other answers have said, you need to initialize that memory. You can do that in more than one way, but one way would be to use calloc instead of malloc. If you change these two lines:

s1 = (char *)malloc(size);    
s2 = (char *)malloc(size);

to:

s1 = calloc(size,sizeof(*s1));   
s2 = calloc(size,sizeof(*s2));

your program will run, because calloc zero-fills the memory it returns, so both buffers start out as empty strings.

When you call malloc, you receive a char* to some memory. You own size bytes of that memory. But malloc does not prepare the memory you now own in any way. It is left in whatever state its previous owner left it in. Thus, it may well be the case that the memory you receive contains no zero byte within the first size bytes, so it looks like a string that is longer than size.

Thus, when you begin your strcat and run to the end of the first string, it will run past size bytes and attempt to start writing to the memory beyond. Here the problem arises: you don't own the memory at that location, and thus the program segfaults.

On the other hand, if you initialise the strings by setting the first byte to '\0', you are in effect setting the length of the string to 0 (because there are 0 bytes before the end token '\0'). Thus, when you begin your strcat, it will again run to the end of the first string, but this time the end is within size.

Beware that you may still run into problems if the combined string would be longer than size.
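One way to guard against that overflow is to pass the destination's capacity and refuse to append when the result would not fit. The following is a sketch under assumptions: bounded_strcat and its return convention are hypothetical, and dst's first cap bytes are assumed to be initialized:

```c
#include <string.h>

/* Hypothetical bounded append: copies src onto the end of dst only if
 * the result, including the terminator, fits in `cap` bytes.
 * Returns 0 on success, -1 if dst has no terminator within `cap` bytes
 * or if there is not enough room. dst is left unchanged on failure. */
int bounded_strcat(char *dst, size_t cap, const char *src) {
    const char *end = (const char *)memchr(dst, '\0', cap);
    if (end == NULL) {
        return -1;                      /* dst is not a valid string */
    }
    size_t used = (size_t)(end - dst);  /* current length of dst */
    size_t need = strlen(src) + 1;      /* bytes to copy, with '\0' */
    if (need > cap - used) {
        return -1;                      /* would overflow dst */
    }
    memcpy(dst + used, src, need);
    return 0;
}
```

In the question's program, a call would look like bounded_strcat(s1, size, s2), with the return value checked before printing s1.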

The statement

while(*s++);  

makes no sense for the first two calls of mystrcat and invokes undefined behavior, because s1 and s2 have not been initialized at that point. You should not read uninitialized memory.

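You need to check whether the first argument passed is an empty string or not. Note that strlen itself reads the buffer, so this check only helps when the buffer has already been initialized (for example with s[0] = '\0'); it cannot detect uninitialized memory.

#include <string.h>

void mystrcat(char *s,char *t) {
    /* strlen reads s, so s must already be null-terminated */
    if(strlen(s))
    {
        while(*s++);    /* advance one past the terminator */
        s--;            /* step back onto the terminator */
    }
    while(*s++ = *t++); /* copy t, including its terminator */
}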
