
Confusion on strings in C programming

So I am learning to program in C using the Dev C++ compiler. Question 1:

#include <stdio.h> 
#include <conio.h> //for the getch() function
#include <string.h> 

int main(void) 
{ 
    char line[3]; 
    strcpy(line, "Hello world"); 
    printf("%s", line); 
    getch(); 
} 

Output: Hello world

Why is it that it displays all of "Hello world" when I declared my string to only hold 3 characters?

Question 2:

char line[3] = "Hello world"; 
printf("%s", line); 

Output: Hel

Why is it that it displays "Hel"? Shouldn't it display only "He", since line[0] = H, line[1] = e and line[2] = '\0'? And doesn't %s work by searching for a '\0'?

Please help me understand what's really happening. Thanks!

Please help me understand what's really happening.

Undefined behaviour!

When you do this, you have a buffer overrun:

char line[3]; 
strcpy(line, "Hello world"); 

Why is it that it displays all of "Hello world" when I declared my string to only hold 3 characters?

You're copying more than the allocated array can hold: "Hello world" needs 12 bytes (11 characters plus the terminating '\0'), but line has room for only 3. This is undefined behaviour, so any outcome is possible, including but not limited to calling Aunt Tilda, formatting your hard disk, etc. :) See here for more.


char line[3] = "Hello world"; 
printf("%s", line); 

Here you have a buffer over-read! Refer to alk's answer on why only 3 characters get copied to line.

Why is it that it displays "Hel"? Shouldn't it display only "He"

No, it can display anything, again because of undefined behaviour. See what output I get on my machine:

Hel☻

This is undefined behaviour because printf expects a null-terminated string, yes, but that doesn't mean you may access memory beyond the end of the array. You have an array like this in memory:

                   [0] [1] [2]
  -----------------------------------------------
  . . . █ | █ | █ | H | e | l | █ | █ | █ | . . .
  -----------------------------------------------
                  <-- line --->

Anything shown as █ above is an unknown value that is not under your control, so accessing it is undefined. However, %s in printf expects a null-terminated string, so, on your orders, it reads beyond what is allowed (only the three elements up to l). In my case the \0 appeared one element after l (past the smiley), while in your case it came right after l, so the output looked correct, but only by luck; it could just as well have appeared 1000 elements later.


If you really want to print a char array that is not null-terminated, only up to the allowed limit, you can do one of these without hitting any undefined behaviour.

printf("%.3s", line);       // length specified at compile-time

printf("%.*s", 3, line);    // length fed at run-time

See here for further information.

Regarding Question 2:

When a "string" literal is used as an initialiser, the 0-terminator is stored only if there is room for it.

From the C99 Standard:

6.7.8 Initialization

[...]

14 An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

Both example programs have undefined behaviour. In the first example you overwrite memory beyond the array. In the second example, C does not allow more initializers than the object can accept:

2 No initializer shall attempt to provide a value for an object not contained within the entity being initialized.

The only exception is made for character arrays, which are allowed to drop the terminating zero:

14 An array of character type may be initialized by a character string literal or UTF−8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

So the second code snippet should not compile, or at least the compiler shall issue a diagnostic message.

Why is it that it displays all of "Hello world" when I declared my string to only hold 3 characters?

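Because printf() reads a string up to a null terminator. It doesn't know how big the storage is, and neither does strcpy(); if you want to make sure the copy doesn't exceed the size of the storage, use strncpy() (notice the n in the middle). Be aware that strncpy() does not write a terminating '\0' when the source is as long as or longer than the limit, so you have to add it yourself.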

Why is it that it displays "Hel"?

There doesn't have to be an explanation for this, since you've already overflowed the buffer; this could have any kind of bizarre consequence for the program. You could have overwritten something else (and conversely, your data might get overwritten subsequently). If you break the rules, you are most likely invoking "undefined behaviour".

It could be that in this case the compiler only wrote 3 characters because of the form of the initialization, but that is not something to count on; there aren't necessarily rules for what happens when you break the rules.
