strcpy behaving differently when two pointers are assigned strings in different ways

I am sorry, I might be asking a dumb question, but I want to understand: is there any difference between the assignments below? strcpy works in the first case but not in the second.

char *str1;
*str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2);    //Works as expected

char *str1 = "Hello";
char *str2 = "World";
strcpy(str1,str2);    //SEGMENTATION FAULT

How does the compiler understand each assignment? Please clarify.

Sorry, both examples are very wrong and lead to undefined behaviour, which might or might not crash. Let me try to explain why:

  • In the first example, str1 is an uninitialized pointer (often loosely called a dangling pointer). That means str1 points to some arbitrary place in memory, and writing through it can have arbitrary consequences: for example a crash, or overwriting some data in memory (e.g. other local variables or variables in other functions; anything is possible).
  • The line *str1 = "Hello"; is also wrong (even if str1 were a valid pointer), because *str1 has type char (not char *) and is the first character of whatever str1 points to, which here is invalid. However, you assign it a pointer ("Hello", which is converted to type char *), which is a type error that your compiler will tell you about.
  • In the second example, str1 (like str2) is a valid pointer but presumably points to read-only memory (hence the crash). Normally, string literals are stored in read-only data in the binary. You cannot write to them, but that's exactly what strcpy(str1,str2); tries to do.

A more correct example of what you want to achieve might be (with an array on the stack):

#define STR1_LEN 128
char str1[STR1_LEN] = "Hello"; /* array with space for 128 characters */
char *str2 = "World";
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */

Another option (with dynamically managed memory):

#define STR1_LEN 128
char *str1 = malloc(STR1_LEN); /* allocate dynamic memory for str1 */
char *str2 = "World";
/* we should check here that str1 is not NULL, which would mean 'out of memory' */
strncpy(str1, str2, STR1_LEN);
str1[STR1_LEN - 1] = 0; /* be sure to terminate str1 */
free(str1); /* free the memory for str1 */
str1 = NULL;
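
The comment in the snippet above notes that the result of malloc should be checked. A minimal sketch of that check is shown here; the helper name copy_to_heap is hypothetical and not part of the original answer, and it sizes the buffer from the source string rather than from a fixed constant.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, writable copy of src,
 * or NULL if malloc fails. The caller is responsible for calling free(). */
char *copy_to_heap(const char *src)
{
    assert(src != NULL);

    size_t size = strlen(src) + 1;   /* characters plus the final '\0' */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;                 /* out of memory: nothing was copied */
    }
    memcpy(copy, src, size);         /* size already includes the '\0' */
    return copy;
}
```

A caller would write something like char *str1 = copy_to_heap("World"); and then test str1 against NULL before using it, and call free(str1) when done.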

EDIT: @chqrlie requested in the comments that the #define should be named STR1_SIZE, not STR1_LEN, presumably to reduce confusion: it is not the length in characters of the "string" but the size of the allocated buffer. Furthermore, @chqrlie requested not to give examples with the strncpy function. That wasn't really my choice: the OP used strcpy, which is very dangerous, so I picked the closest function that can be used correctly. But yes, I should probably have added that the use of strcpy, strncpy, and similar functions is not recommended.
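
One possible alternative to strncpy, offered only as a sketch and not as what the original answer or the commenter proposed, is the standard snprintf function, which always terminates the destination when the size is non-zero. The helper name copy_bounded is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, truncating if necessary.
 * dst is always '\0'-terminated. Returns 1 if the whole string fit,
 * 0 if it had to be truncated (or if snprintf reported an error). */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Passing sizeof str1 as dst_size for an array such as char str1[STR1_LEN] keeps the size and the buffer in sync.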

Edit: In the first snippet you wrote *str1 = "Hello", which is equivalent to assigning to str1[0]. That is obviously wrong, because str1 is uninitialized and is therefore an invalid pointer. If we assume that you meant str1 = "Hello", then you are still wrong:

According to the C specification, attempting to modify a string literal results in undefined behavior: literals may be stored in read-only storage (such as .rodata) or combined with other string literals. So both snippets that you provided will yield undefined behavior.

I can only guess that in the second snippet the compiler is storing the string in some read-only storage, while in the first one it does not, so it appears to work, but that is not guaranteed.
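
To make the distinction concrete, here is a small sketch (the function name is invented for illustration) contrasting a pointer to a string literal with an array initialised from a literal. Only the array is a private, writable copy.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demonstration: returns 1 if modifying the array copy
 * worked and left the text seen through the literal pointer untouched. */
int array_copy_is_writable(void)
{
    const char *literal = "Hello";  /* refers to the literal; never write through it */
    char copy[] = "Hello";          /* separate array, initialised from the literal */

    assert(sizeof copy == 6);       /* 5 characters plus the final '\0' */
    copy[0] = 'J';                  /* fine: copy is ordinary writable storage */

    return strcmp(copy, "Jello") == 0 && strcmp(literal, "Hello") == 0;
}
```

Declaring the pointer as const char * lets the compiler reject accidental writes such as literal[0] = 'J'; at compile time instead of leaving them as undefined behavior at run time.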

There seems to be some confusion here. Both fragments invoke undefined behaviour. Let me explain why:

  • char *str1; defines a pointer to characters, but it is uninitialized. If this definition occurs in the body of a function, its value is indeterminate and therefore invalid. If this definition occurs at the global level, it is initialized to NULL.

  • *str1 = "Hello"; is an error: you are assigning a string pointer to the character pointed to by str1. str1 is uninitialized, so it does not point to anything valid, and you cannot assign a pointer to a character. You should have written str1 = "Hello";. Furthermore, the string "Hello" is constant, so the definition of str1 really should be const char *str1;.

  • char *str2 = "World"; Here you define a pointer to a constant string "World". This statement is correct, but it would be better to define str2 as const char *str2 = "World"; for the same reason as above.
  • strcpy(str1,str2); //Works as expected: NO, it does not work at all! str1 does not point to a char array large enough to hold a copy of the string "World" including the final '\0'. Given the circumstances, this code invokes undefined behaviour, which may or may not cause a crash.
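
Two of the points in the list above can be checked directly. The sketch below uses invented names (file_scope_ptr, file_scope_pointer_is_null, bytes_needed) and only illustrates the rules; it is not taken from the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A pointer defined at file scope without an initialiser starts out as NULL.
 * (A local pointer without an initialiser has no such guarantee.) */
static char *file_scope_ptr;

int file_scope_pointer_is_null(void)
{
    return file_scope_ptr == NULL;
}

/* Number of bytes a destination needs to hold a copy of s,
 * counting the final '\0'. */
size_t bytes_needed(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}
```

For "World" this gives 6, which matches sizeof "World": five characters plus the terminator.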

You mention that the code works as expected: it only does so in appearance. What really happens is this: str1 is uninitialized. If it pointed to an area of memory that cannot be written, writing to it would likely have crashed the program with a segmentation fault. But if it happens to point to an area of memory where you can write, the next statement *str1 = "Hello"; will modify the first byte of this area, and then strcpy(str1, "World"); will modify the first 6 bytes at that place. The string pointed to by str1 will then be "World", as expected, but you have overwritten some area of memory that may be used for other purposes. Your program may consequently crash later in unexpected ways: a very hard to find bug! This is definitely undefined behaviour.

The second fragment invokes undefined behaviour for a different reason:

  • char *str1 = "Hello"; No problem, but should be const.
  • char *str2 = "World"; OK too, but should also be const.
  • strcpy(str1,str2); //SEGMENTATION FAULT: of course, it is invalid. You are trying to overwrite the constant character string "Hello" with the characters from the string "World". It would work if the string constant were stored in modifiable memory, and would cause even greater confusion later in the program, because the value of the string constant would have changed. Luckily, most modern environments prevent this by storing string constants in read-only memory. Trying to modify that memory causes a segmentation violation, i.e. you are accessing a segment of memory in a way that is not permitted.

You should use strcpy() only to copy strings to character arrays that you define as char buffer[SOME_SIZE]; or allocate as char *buffer = malloc(SOME_SIZE);, with SOME_SIZE large enough to hold what you are trying to copy plus the final '\0'.
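
The rule in the previous paragraph can be sketched as a small wrapper that refuses to call strcpy() unless the destination is known to be large enough. The helper name copy_if_fits and the value chosen for SOME_SIZE are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SOME_SIZE 16   /* arbitrary size chosen for the example */

/* Hypothetical helper: copies src into dst only if it fits,
 * including the final '\0'. Returns 0 on success, -1 if src is too long
 * (in which case dst is left unchanged). */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    if (strlen(src) + 1 > dst_size) {
        return -1;              /* not enough room: do not touch dst */
    }
    strcpy(dst, src);           /* safe here: the size was checked above */
    return 0;
}
```

With char buffer[SOME_SIZE];, the call copy_if_fits(buffer, sizeof buffer, "World") succeeds, while a destination size of 3 would be rejected.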

Both snippets are wrong, even if "it works" in your first case. Hopefully this is only an academic question! :)

First let's look at *str1, which you are trying to modify.

char *str1;

This declares a dangling (uninitialized) pointer, that is, a pointer holding some unspecified address in memory. Here the program is simple and there is nothing important to damage, but you could have modified very critical data here!

char *str = "Hello";

This declares a pointer to a protected section of memory that even the program itself cannot change during execution. Attempting to write there is what triggers the segmentation fault.

To use strcpy(), the first parameter should point to writable storage that is large enough, for example a char array dynamically allocated with malloc(). In fact, don't use strcpy(); learn to use strncpy() instead because it is safer, as long as you terminate the result yourself.
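
One caveat with strncpy() is worth seeing in code: when the source does not fit, strncpy() fills the destination completely and writes no terminator. The sketch below uses invented helper names and shows both the pitfall and the usual workaround of terminating manually.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: runs strncpy(dst, src, n) and reports whether
 * dst contains a '\0' somewhere in its first n bytes afterwards. */
int terminated_after_strncpy(char *dst, size_t n, const char *src)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, n);
    return memchr(dst, '\0', n) != NULL;
}

/* Hypothetical helper: truncating copy that always terminates dst. */
void copy_truncating(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);   /* leave room for the terminator */
    dst[dst_size - 1] = '\0';          /* strncpy may not have written one */
}
```

Copying "World" into a 5-byte array with plain strncpy() leaves it unterminated, whereas copy_truncating() stores the shortened but valid string "Worl".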
