
Confusion About Array and String in C

What is the difference between S1 , S2 and S3 ?

char S1[6];
S1[0] = 'A';
S1[1] = 'r';
S1[2] = 'r';
S1[3] = 'a';
S1[4] = 'y';

char S2[6] = {'A','r','r','a','y'};

string S3 = "Array";

When I run the program using if (strcmp(a,b) == 0) , where a and b are any two of S1, S2, S3 , it shows that S2 and S3 are the same, and that S1 and S2 are different. Why is this the case? Why aren't all three equivalent?

And when I add '\0' back to both S1b and S1c , all 3 are the same. This is understandable.

BUT why, in my first trial, are S2 and S3 the same then? I did not include '\0' there either. I expected S1 and S2 to be the same, and S2 and S3 to differ.

Can anyone tell me why my thought is wrong???

Thanks for your answers. I have tried changing the setup to the following:

char S1[5];
S1[0] = 'A';
S1[1] = 'r';
S1[2] = 'r';
S1[3] = 'a';
S1[4] = 'y';

char S2[5] = {'A','r','r','a','y'};

string S3 = "Array";

And now clearly S2 and S3 are not the same, since they differ by a '\0' . However, I am still a bit confused about why S1 and S2 are again not the same this time when I use strcmp to compare the two.

Compare the actual in-memory values of the arrays:

  1. S1 has 6 elements, yet you only assign values to indices 0-4. The 6th element, S1[5] , is never set, so its value is indeterminate: it retains whatever the memory location held before the array was allocated.
  2. S2 is similar to S1 in that only 5 values are provided, however when using the {,} syntax any remaining elements are zeroed. So char foo[5] = { 1, 2 } is identical to char foo[5] = { 1, 2, 0, 0, 0} .
  3. S3 uses the string-literal syntax for initialising an array, which creates an array of char (or wchar_t ) with an extra element set to \0 (the null terminator).

Visually:

S1 = 0x41, 0x72, 0x72, 0x61, 0x79, 0x??
S2 = 0x41, 0x72, 0x72, 0x61, 0x79, 0x00
S3 = 0x41, 0x72, 0x72, 0x61, 0x79, 0x00

Note that you're running into a safety problem with strcmp : it has no length parameter, so it keeps reading until it encounters \0 , which might be never (i.e. until it causes a segfault or access violation). Instead use a bounded function like strncmp or (if using C++) the std::string type.

It shows that S2 and S3 are the same, and S1 and S2 is different.

S3 contains the null terminator, which S1 does not have. This string S3 = "Array"; means

| A | r | r | a | y | \0 |

While S2 is

| A | r | r | a | y | \0 |

While S1 is

| A | r | r | a | y | Garbage |

Comparing S1 and S2 with strcmp is undefined behaviour, because S1 is not null-terminated and strcmp takes no length argument. With both arrays explicitly terminated, the comparison is well defined:

#include <stdio.h>
#include <string.h>

int main(void) 
{
    char S1[6];
    S1[0] = 'A';
    S1[1] = 'r';
    S1[2] = 'r';
    S1[3] = 'a';
    S1[4] = 'y';
    S1[5] = 0;

    char S2[6] = {'A','r','r','a','y', 0};
    printf("%d\n", strcmp(S1, S2));
    return 0;
}

Outputs:

0

strcmp() function starts comparing the first character of each string. If they are equal to each other, it continues with the following pairs until the characters differ or until a terminating null-character is reached.

I don't think it is safe to compare S1 and S2 this way. The input to strcmp is the address of the first character, and S1 is not null-terminated. Although 6 bytes are allocated in both cases, S1[5] is never initialised, so it most likely holds a garbage value. The risk is that strcmp, while searching for a differing character or a null character, keeps reading memory beyond the end of the array. This can even lead to a segfault or access violation.

Visualising the memory layout of S1, S2 and S3, it might look something like this

S1 = A | r | r | a | y | ?
S2 = A | r | r | a | y | 0
S3 = A | r | r | a | y | 0

Any comparison between S2 and S3 is safe. S1 vs S2 or S3 might not be.

Just adding to the existing answers

char S2[6] = {'A','r','r','a','y'};

string S3 = "Array";

Both are null-terminated, and hence strcmp() works well and reports that they are the same. For S1, the elements are assigned one by one and nothing terminates the array, so it is not a valid string in C. Using strcmp() on it therefore leads to undefined behaviour.

The point with S3 is that "Array" is a string literal, which is read-only. Such values are usually stored in read-only memory. If string here is an alias for char * (an assumption, since string is not a standard C type), S3 points directly at the literal, so writing through S3 after initialisation is undefined behaviour and you might see a crash. Keep this in mind when using assignments like the one for S3.


 