简体   繁体   中英

Confusion in “strcat function in C assumes the destination string is large enough to hold contents of source string and its own.”

So I read that strcat function is to be used carefully as the destination string should be large enough to hold contents of its own and source string. And it was true for the following program that I wrote:

#include <stdio.h>
#include <string.h>

int main(){
    char *src, *dest;
    printf("Enter Source String : ");
    fgets(src, 10, stdin);
    printf("Enter destination String : ");
    fgets(dest, 20, stdin);
    strcat(dest, src);
    printf("Concatenated string is %s", dest);
    return 0;
}

But not true for the one that I wrote here:

#include <stdio.h>
#include <string.h>

int main(){
    char src[11] = "Hello ABC";
    char dest[15] = "Hello DEFGIJK";
    strcat(dest, src);
    printf("concatenated string %s", dest);
    getchar();
    return 0;
}

This program ends up adding both without considering that destination string is not large enough. Why is it so?

The strcat function has no way of knowing exactly how long the destination buffer is, so it assumes that the buffer passed to it is large enough. If it's not, you invoke undefined behavior by writing past the end of the buffer. That's what's happening in the second piece of code.

The first piece of code is also invalid because both src and dest are uninitialized pointers. When you pass them to fgets , it reads whatever garbage value they contain, treats it as a valid address, then tries to write values to that invalid address. This is also undefined behavior.

One of the things that makes C fast is that it doesn't check to make sure you follow the rules. It just tells you the rules and assumes that you follow them, and if you don't bad things may or may not happen. In your particular case it appeared to work but there's no guarantee of that.

For example, when I ran your second piece of code it also appeared to work. But if I changed it to this:

#include <stdio.h>
#include <string.h>

int main(){
    char dest[15] = "Hello DEFGIJK";
    strcat(dest, "Hello ABC XXXXXXXXXX");
    printf("concatenated string %s", dest);
    return 0;
}

The program crashes.

I think your confusion is not actually about the definition of strcat . Your real confusion is that you assumed that the C compiler would enforce all the "rules". That assumption is quite false.

Yes, the first argument to strcat must be a pointer to memory sufficient to store the concatenated result. In both of your programs, that requirement is violated. You may be getting the impression, from the lack of error messages in either program, that perhaps the rule isn't what you thought it was, that somehow it's valid to call strcat even when the first argument is not a pointer to enough memory. But no, that's not the case: calling strcat when there's not enough memory is definitely wrong. The fact that there were no error messages, or that one or both programs appeared to "work", proves nothing.

Here's an analogy. (You may even have had this experience when you were a child.) Suppose your mother tells you not to run across the street, because you might get hit by a car. Suppose you run across the street anyway, and do not get hit by a car. Do you conclude that your mother's advice was incorrect? Is this a valid conclusion?

In summary, what you read was correct: strcat must be used carefully. But let's rephrase that: you must be careful when calling strcat . If you're not careful, all sorts of things can go wrong, without any warning. In fact, many style guides recommend not using functions such as strcat at all, because they're so easy to misuse if you're careless. (Functions such as strcat can be used perfectly safely as long as you're careful -- but of course not all programmers are sufficiently careful.)

The strcat() function is indeed to be used carefully because it doesn't protect you from anything. If the source string isn't NULL-terminated, the destination string isn't NULL-terminated, or the destination string doesn't have enough space, strcat will still copy data. Therefore, it is easy to overwrite data you didn't mean to overwrite. It is your responsibility to make sure you have enough space. Using strncat() instead of strcat will also give you some extra safety.

Edit Here's an example:

#include <stdio.h>
#include <string.h>

int main()
{
    char s1[16] = {0};
    char s2[16] = {0};
    strcpy(s2, "0123456789abcdefOOPS WAY TOO LONG");
      /* ^^^ purposefully copy too much data into s2 */
    printf("-%s-\n",s1);
    return 0;
}

I never assigned to s1 , so the output should ideally be -- . However, because of how the compiler happened to arrange s1 and s2 in memory, the output I actually got was -OOPS WAY TOO LONG- . The strcpy(s2,...) overwrote the contents of s1 as well.

On gcc, -Wall or -Wstringop-overflow will help you detect situations like this one, where the compiler knows the size of the source string. However, in general, the compiler can't know how big your data will be. Therefore, you have to write code that makes sure you don't copy more than you have room for.

Both snippets invoke undefined behavior - the first because src and dest are not initialized to point anywhere meaningful, and the second because you are writing past the end of the array.

C does not enforce any kind of bounds checking on array accesses - you won't get an "Index out of range" exception if you try to write past the end of an array. You may get a runtime error if you try to access past a page boundary or clobber something important like the frame pointer, but otherwise you just risk corrupting data in your program.

Yes, you are responsible for making sure the target buffer is large enough for the final string. Otherwise the results are unpredictable.

I'd like to point out what is actually happening in the 2nd program in order to illustrate the problem.

It allocates 15 bytes at the memory location starting at dest and copies 14 bytes into it (including the null terminator):

    char dest[15] = "Hello DEFGIJK";

...and 11 bytes at src with 10 bytes copied into it:

    char src[11] = "Hello ABC";

The strcat() call then copies 10 bytes (9 chars plus the null terminator) from src into dest, starting right after the 'K' in dest. The resulting string at dest will be 23 bytes long including the null terminator. The problem is, you allocated only 15 bytes at dest, and the memory adjacent to that memory will be overwritten, ie corrupted, leading to program instability, wrong results, data corruption, etc.

Note that the strcat() function knows nothing about the amount of memory you've allocated at dest (or src, for that matter). It is up to you to make sure you've allocated enough memory at dest to prevent memory corruption.

By the way, the first program doesn't allocate memory at dest or src at all, so your calls to fgets() are corrupting memory starting at those locations.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM