
Confusion in "strcat function in C assumes the destination string is large enough to hold contents of source string and its own."

So I read that the strcat function must be used carefully, because the destination string should be large enough to hold both its own contents and those of the source string. And that was true for the following program that I wrote:

#include <stdio.h>
#include <string.h>

int main(){
    char *src, *dest;
    printf("Enter Source String : ");
    fgets(src, 10, stdin);
    printf("Enter destination String : ");
    fgets(dest, 20, stdin);
    strcat(dest, src);
    printf("Concatenated string is %s", dest);
    return 0;
}

But it was not true for this one:

#include <stdio.h>
#include <string.h>

int main(){
    char src[11] = "Hello ABC";
    char dest[15] = "Hello DEFGIJK";
    strcat(dest, src);
    printf("concatenated string %s", dest);
    getchar();
    return 0;
}

This program ends up concatenating both strings without considering that the destination string is not large enough. Why is that?

The strcat function has no way of knowing exactly how long the destination buffer is, so it assumes that the buffer passed to it is large enough. If it's not, you invoke undefined behavior by writing past the end of the buffer. That's what's happening in the second piece of code.

The first piece of code is also invalid because both src and dest are uninitialized pointers. When you pass them to fgets, it takes whatever garbage value they contain, treats it as a valid address, and tries to write to that invalid address. This is also undefined behavior.
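As a rough sketch of how the first program could be given real storage, the version below replaces the uninitialized pointers with arrays. The names `read_and_concat`, `SRC_MAX`, `DEST_MAX` and `RESULT_SIZE` are made up for this illustration; the sizes 10 and 20 are the fgets limits from the question. The result buffer is sized for the worst case (19 + 9 characters plus the null terminator). Note that fgets keeps the trailing newline, so it ends up in the concatenated string, just as it would in the original.

```c
#include <stdio.h>
#include <string.h>

#define SRC_MAX 10    /* fgets limit used for src in the question */
#define DEST_MAX 20   /* fgets limit used for dest in the question */
/* Worst case: 19 chars from dest + 9 chars from src + 1 null terminator. */
#define RESULT_SIZE (SRC_MAX + DEST_MAX - 1)

/* Hypothetical helper: reads the source line, then the destination line,
   from `in`, and leaves "destination + source" in result.
   result must point to at least RESULT_SIZE bytes.
   Returns 0 on success, -1 if either read fails. */
int read_and_concat(FILE *in, char result[RESULT_SIZE])
{
    char src[SRC_MAX];   /* a real array, so fgets has valid memory to fill */

    if (fgets(src, SRC_MAX, in) == NULL)
        return -1;
    if (fgets(result, DEST_MAX, in) == NULL)
        return -1;

    /* Safe here only because RESULT_SIZE covers the worst case above. */
    strcat(result, src);
    return 0;
}
```

A caller would declare `char dest[RESULT_SIZE];` and call `read_and_concat(stdin, dest)` after printing its prompts.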

One of the things that makes C fast is that it doesn't check to make sure you follow the rules. It just tells you the rules and assumes that you follow them, and if you don't, bad things may or may not happen. In your particular case it appeared to work, but there's no guarantee of that.

For example, when I ran your second piece of code it also appeared to work. But if I changed it to this:

#include <stdio.h>
#include <string.h>

int main(){
    char dest[15] = "Hello DEFGIJK";
    strcat(dest, "Hello ABC XXXXXXXXXX");
    printf("concatenated string %s", dest);
    return 0;
}

The program crashes.

I think your confusion is not actually about the definition of strcat. Your real confusion is that you assumed that the C compiler would enforce all the "rules". That assumption is quite false.

Yes, the first argument to strcat must be a pointer to memory sufficient to store the concatenated result. In both of your programs, that requirement is violated. You may be getting the impression, from the lack of error messages in either program, that perhaps the rule isn't what you thought it was, and that it's somehow valid to call strcat even when the first argument is not a pointer to enough memory. But no, that's not the case: calling strcat when there's not enough memory is definitely wrong. The fact that there were no error messages, or that one or both programs appeared to "work", proves nothing.

Here's an analogy. (You may even have had this experience when you were a child.) Suppose your mother tells you not to run across the street, because you might get hit by a car. Suppose you run across the street anyway, and do not get hit by a car. Do you conclude that your mother's advice was incorrect? Is this a valid conclusion?

In summary, what you read was correct: strcat must be used carefully. But let's rephrase that: you must be careful when calling strcat. If you're not careful, all sorts of things can go wrong, without any warning. In fact, many style guides recommend not using functions such as strcat at all, because they're so easy to misuse if you're careless. (Functions such as strcat can be used perfectly safely as long as you're careful, but of course not all programmers are sufficiently careful.)

The strcat() function is indeed to be used carefully because it doesn't protect you from anything. If the source string isn't null-terminated, the destination string isn't null-terminated, or the destination buffer doesn't have enough space, strcat will still copy data. Therefore, it is easy to overwrite data you didn't mean to overwrite. It is your responsibility to make sure you have enough space. Using strncat() instead of strcat will also give you some extra safety, provided the count you pass is the space remaining in the destination rather than its total size.
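To make the strncat suggestion concrete, here is a minimal sketch. The wrapper name `append_truncating` is hypothetical; the only library calls are the standard strlen and strncat. The third argument of strncat is the maximum number of characters to append, and strncat always writes a terminating null after them, so the count has to leave one byte free for that terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: appends as much of src as fits in dest, which is a
   buffer of dest_size bytes already holding a null-terminated string.
   Returns 1 if all of src fit, 0 if it was truncated. */
int append_truncating(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest) + 1;   /* bytes in use, including the null */

    if (used > dest_size)
        return 0;                     /* dest was already inconsistent */

    size_t room = dest_size - used;   /* characters that can still be added */

    /* strncat appends at most `room` characters, then a null terminator. */
    strncat(dest, src, room);
    return strlen(src) <= room;
}
```

With the arrays from the question, `char dest[15] = "Hello DEFGIJK"` has room for only one more character, so appending "Hello ABC" this way would leave "Hello DEFGIJKH" and report truncation. Silent truncation can be its own kind of bug, so the return value is worth checking.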

Edit: Here's an example:

#include <stdio.h>
#include <string.h>

int main()
{
    char s1[16] = {0};
    char s2[16] = {0};
    strcpy(s2, "0123456789abcdefOOPS WAY TOO LONG");
      /* ^^^ purposefully copy too much data into s2 */
    printf("-%s-\n",s1);
    return 0;
}

I never assigned to s1, so the output should ideally be --. However, because of how the compiler happened to arrange s1 and s2 in memory, the output I actually got was -OOPS WAY TOO LONG-. The strcpy(s2,...) overwrote the contents of s1 as well.

On gcc, -Wall or -Wstringop-overflow will help you detect situations like this one, where the compiler knows the size of the source string. However, in general, the compiler can't know how big your data will be. Therefore, you have to write code that makes sure you don't copy more than you have room for.
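One possible way to write such code is to let snprintf do the bounded copy. This is only a sketch: `bounded_copy` is a name invented here, while snprintf is the standard C99 function, which writes at most size - 1 characters plus a null terminator and returns the length the full output would have had.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dest without ever writing more than
   dest_size bytes. Returns 1 if the whole string fit, 0 if it was cut off
   (or if snprintf reported an error). */
int bounded_copy(char *dest, size_t dest_size, const char *src)
{
    /* snprintf returns the length src would need, excluding the null. */
    int needed = snprintf(dest, dest_size, "%s", src);

    return needed >= 0 && (size_t)needed < dest_size;
}
```

Applied to the example above, `bounded_copy(s2, sizeof s2, "0123456789abcdefOOPS WAY TOO LONG")` would store only the first 15 characters in s2, return 0, and leave s1 alone.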

Both snippets invoke undefined behavior: the first because src and dest are not initialized to point anywhere meaningful, and the second because you are writing past the end of the array.

C does not enforce any kind of bounds checking on array accesses, so you won't get an "Index out of range" exception if you try to write past the end of an array. You may get a runtime error if you try to access past a page boundary or clobber something important like the frame pointer, but otherwise you just risk corrupting data in your program.

Yes, you are responsible for making sure the target buffer is large enough for the final string. Otherwise the results are unpredictable.
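One way to take on that responsibility is to pass the buffer size along and refuse to concatenate when the result would not fit. The function below is a minimal sketch; `checked_strcat` is a hypothetical name, not a standard library function, and it assumes the caller passes the true size of the destination array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dest only if the result, including
   the null terminator, fits in dest_size bytes.
   Returns 0 on success, -1 (leaving dest unchanged) otherwise. */
int checked_strcat(char *dest, size_t dest_size, const char *src)
{
    /* Look for dest's terminator without reading past the buffer. */
    const char *end = memchr(dest, '\0', dest_size);
    if (end == NULL)
        return -1;                        /* dest is not a valid string */

    size_t dest_len = (size_t)(end - dest);
    size_t src_len = strlen(src);

    /* src_len characters plus one null must fit in the remaining bytes. */
    if (src_len >= dest_size - dest_len)
        return -1;

    memcpy(dest + dest_len, src, src_len + 1);   /* +1 copies the null */
    return 0;
}
```

For the second program in the question, `checked_strcat(dest, sizeof dest, src)` would return -1, because 13 + 9 characters plus the terminator need 23 bytes and dest has only 15.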

I'd like to point out what is actually happening in the second program in order to illustrate the problem.

It allocates 15 bytes at the memory location starting at dest and copies 14 bytes into it (13 characters plus the null terminator):

    char dest[15] = "Hello DEFGIJK";

...and 11 bytes at src, with 10 bytes copied into it:

    char src[11] = "Hello ABC";

The strcat() call then copies 10 bytes (9 chars plus the null terminator) from src into dest, starting right after the 'K' in dest. The resulting string at dest will be 23 bytes long including the null terminator. The problem is that you allocated only 15 bytes at dest, so the memory adjacent to it will be overwritten, i.e. corrupted, leading to program instability, wrong results, data corruption, etc.

Note that the strcat() function knows nothing about the amount of memory you've allocated at dest (or src, for that matter). It is up to you to make sure you've allocated enough memory at dest to prevent memory corruption.
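If the sizes are only known at run time, one option is to compute the required size first and allocate exactly that much. The sketch below assumes heap allocation is acceptable; `concat_alloc` is a hypothetical name, and the caller is responsible for freeing the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding first
   followed by second, or NULL if the allocation fails.
   The caller must free() the returned pointer. */
char *concat_alloc(const char *first, const char *second)
{
    size_t first_len = strlen(first);
    size_t second_len = strlen(second);

    /* Both strings plus one byte for the null terminator. */
    char *result = malloc(first_len + second_len + 1);
    if (result == NULL)
        return NULL;

    memcpy(result, first, first_len);
    memcpy(result + first_len, second, second_len + 1);  /* +1 copies the null */
    return result;
}
```

For "Hello DEFGIJK" and "Hello ABC" this allocates 13 + 9 + 1 = 23 bytes, matching the count worked out above.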

By the way, the first program doesn't allocate memory at dest or src at all, so your calls to fgets() write to whatever addresses those uninitialized pointers happen to hold, corrupting memory there.
