
C style strings, Pointers, arrays

I'm having trouble understanding what a C-style string is. Happy early New Year

What I know: A pointer holds a memory address. Dereferencing the pointer will give you the data at that memory location.

int x = 50;
int* ptr = &x;    //pointer to an integer, holds memory address of x

cout << "&x: " << &x << endl;  //these two lines give the same output as expected
cout << "ptr: " << ptr << endl;

cout << "*ptr: " << dec << (*ptr) << endl;  //prints out decimal number 50
                                           //added dec, so the program doesn't
                //continue to print out hexadecimal numbers like it did for
                 //the memory addresses above
cout << "&ptr: " << &ptr << endl;  //shows that a pointer, like any variable,
                                  //has its own memory address

Now to what I don't understand (using what's above as the source for my confusion): There are multiple ways to declare strings. I'm learning C++, but you can also use C-style strings (good to understand, although inferior to C++ strings)

C++:

string intro = "Hello world!"; 
//the compiler will automatically add a null character, \0, so you don't have to
//worry about declaring an array and putting a line into it that is bigger than 
//it can hold. 

C-style:

char version1[7] = {'H','i',' ','y','o','u','\0'};
char version2[] = "Hi you"; //using quotes, don't need null character? added for you?
char* version3 = "Hi you";

Version3 is where I'm having trouble. Here, there is a pointer to a char. I know that an array name is a pointer to the first element in an array.

cout << " &version3: " << &version3 << endl; //prints out location of 'H'
cout << " *version3: " << *version3 << endl; //prints out 'H'
cout << "  version3: " <<  version3 << endl; //prints out the whole string up to
                                             //automatically inserted \0

Before, in the section "What I know," printing out the name of a pointer would print out the address that it held. Here, printing out the name of a pointer prints out the whole string. Do the double quotes around "Hi you" somehow tell the program: "hey I know you are a pointer, and you are initialized to the location of 'H', but because I see these double quotes, skip forward 1 byte in memory location and print out everything you see until you reach a \0" (1 byte movement because chars are 1 byte large).

How is it that printing out a pointer prints out a string? Before printing out a pointer name printed out the memory address it was initialized to.

Edit: Does cout << &version3 print out the location of 'H' or the location of the pointer version3, which holds the memory address of 'H'?

Printing a char* with cout works differently from printing, say, an int* with cout. For compatibility with C-style strings, the overload of << that takes a char* argument treats it as a C-style string and prints the characters it points to. If you want to print the memory address that a char* holds, you can cast it to a void*.
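A minimal sketch of that cast, writing into a string stream so the two results can be compared. The helper names address_text and string_text are made up for illustration, and the exact address format is implementation-defined:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats the address that a char pointer holds.
std::string address_text(const char* s) {
    assert(s != nullptr);
    std::ostringstream out;
    out << static_cast<const void*>(s);  // void* overload: prints the address
    return out.str();
}

// Hypothetical helper: formats the characters the pointer refers to.
std::string string_text(const char* s) {
    assert(s != nullptr);
    std::ostringstream out;
    out << s;                            // char* overload: prints up to '\0'
    return out.str();
}
```

With the question's variable, cout << static_cast<const void*>(version3) would print an address in the same way the int* did.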

Yes, if you write either of

const char *s1 = "hi lol";  // const is required in C++11 and later
char s2[] = "hi haha";

a NUL (\0) terminator is added for you at the end of the string. The difference between the two is that s1 is a pointer to a string literal, whose contents you must not modify (attempting to do so is undefined behavior in both C and C++), whereas s2 is an array: a block of memory of your own (on the stack, if it is a local variable) that is initialized to hold "hi haha" and whose contents you are free to modify. The array is sized automatically to be exactly large enough for the initializer, including the terminator, which is why the square brackets can be empty.
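Here is a rough sketch of that difference. The function name edited_copy is hypothetical; the pointer is declared const so the snippet also compiles under C++11 and later:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: edits the array copy and returns it as a std::string.
std::string edited_copy() {
    const char* s1 = "hi lol";  // points at a read-only string literal
    char s2[] = "hi haha";      // 8-byte array: 7 characters plus '\0'
    static_assert(sizeof(s2) == 8, "the array includes the terminator");
    assert(std::strlen(s1) == 6);  // strlen does not count the terminator
    s2[0] = 'H';                // fine: s2 is our own writable copy
    return std::string(s2);
}
```

Writing s1[0] = 'H' instead would not even compile with the const in place, which is the point of declaring it that way.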

One side note: in C, if you write char s[3] = "abc";, then s will be initialized to {'a', 'b', 'c'}, without a NUL terminator! This is because a clause in the C standard says that the terminator is copied only if there is room in the array (or some similar wording). In C++ this is not allowed: the same declaration is a compile error. For more, see "No compiler error when fixed size char array is initialized without enough room for null terminator".

Edit, in response to your added question: If you have char *s = "..." and write cout << &s;, it will print the address where the pointer s itself is stored, rather than the address that s holds (the address of the first character of the string to which s refers).
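To make the two addresses concrete, a small sketch (pointer_lives_elsewhere is a made-up name) that compares the address s holds with the address of s itself:

```cpp
#include <cassert>

// Hypothetical check: the pointer variable and the characters it points to
// live at different addresses.
bool pointer_lives_elsewhere() {
    const char* s = "Hi you";
    const void* held = s;   // address of 'H' inside the string literal
    const void* own = &s;   // address of the variable s itself
    assert(*s == 'H');      // dereferencing yields the first character
    return held != own;
}
```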

I know that an array name is a pointer to the first element in an array.

No, it isn't. It only gets converted into one implicitly (in most cases).
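One way to see this for yourself, as a sketch (array_bytes is a hypothetical name): the array keeps its own type and size, and only the conversion produces a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: returns the size of the array from the question.
std::size_t array_bytes() {
    char version2[] = "Hi you";
    // The declared type is an array of 7 chars, not a pointer.
    static_assert(std::is_same<decltype(version2), char[7]>::value,
                  "version2 is an array type");
    const char* decayed = version2;   // implicit array-to-pointer conversion
    assert(decayed == &version2[0]);  // the result points at the first element
    return sizeof(version2);          // size of the whole array
}
```

sizeof applied to a char* would give the size of a pointer instead, which is one of the cases where the conversion does not happen for arrays.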

Do the double quotes around "Hi you" somehow tell the program: "hey I know you are a pointer, and you are initialized to the location of 'H', but because I see these double quotes, skip forward 1 byte in memory location and printout everything you see until you reach a \0"?

No, they don't. It's the type that matters.

std::ostream has an operator<< overload for a generic pointer (const void *), and there is a separate overload for const char *. So when you write cout << "some char array or pointer";, it is the const char * overload that gets invoked, not the one for other pointer types. This overload behaves differently: instead of printing the numeric value of the pointer, it prints every character up to the NUL terminator.
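The same selection can be reproduced with two ordinary overloaded functions. This is only an illustration of overload resolution, not how the stream is implemented; which_overload and check_overloads are made-up names:

```cpp
#include <cassert>
#include <string>

// Two hypothetical overloads standing in for the stream's pointer overloads.
std::string which_overload(const char*) { return "const char*"; }
std::string which_overload(const void*) { return "const void*"; }

// Hypothetical check using the variables from the question.
void check_overloads() {
    int x = 50;
    const char* version3 = "Hi you";
    assert(which_overload(version3) == "const char*");  // exact match wins
    assert(which_overload(&x) == "const void*");        // int* converts to void*
}
```

The double quotes play no part at run time: the compiler picks the overload from the static type of the argument.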

You are asking two questions here, one of which is more general and which I will answer first.

A pointer is a 'handle' (called an address) to a location in memory. The contents of that location is a value. How that value is interpreted is dependent on the 'type' of the pointer. Think of it this way: an address is the location of a house, for example, 10 Main Street. The contents of 10 Main Street are the contents of the house at that address. In C (and C++) terms:

int *p;

Is a variable that can hold the address of an integer.

int x;

Is a variable that IS an integer.

p = &x;

Has p 'point to' x by storing in p the ADDRESS of x.

x = 10;

Stores in x the value 10.

*p == x

Because *p says to 'use the contents of the location that p points to' as a value and then compare it to x.

p == &x

Because &x says to 'use the address of x' as a value and then compare it to p, which is a variable of type pointer.
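Putting those statements together, as a sketch (pointer_identities_hold is a hypothetical name), with one extra write through the pointer to show that p and x really refer to the same location:

```cpp
#include <cassert>

// Hypothetical check of the statements above.
bool pointer_identities_hold() {
    int x;
    int* p;
    p = &x;            // p now holds the address of x
    x = 10;            // store 10 in x
    assert(*p == 10);  // reading through p sees the value of x
    *p = 20;           // writing through p changes x
    return *p == x && p == &x && x == 20;
}
```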

A string, which is a sequence of zero or more characters, is written in C (and C++) source code between double quotes (" and "). In memory, the characters are stored consecutively and are automatically terminated by a trailing null byte, which is a byte with the value 0. The string "Hello, World!" is stored in memory, assuming you are using an 8 bit ASCII character encoding, as:

72 101 108 108 111 44 32 87 111 114 108 100 33 0

You can see this for yourself using the following code snippet:

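const char *p = "Hello, World!";  // string literals are const in C++
size_t len = strlen(p);           // needs <cstring>; excludes the null byte
size_t i;

for (i = 0; i <= len;  ++i)       // <= so the terminating 0 is printed too
{
    std::cout << std::dec << (int) p[i] << ' ';
}
std::cout << std::endl;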

Since the null (zero byte) is automatically added to a string literal by the compiler, the following two arrays are the same size and hold the same contents:

char a1[7] = { 'H', 'i', ' ', 'y', 'o', 'u', '\0' };
char a2[] = "Hi you";

std::cout << strcmp(a1, a2) << std::endl;  // prints 0: the strings are equal

Simply put, a 'C-style string' is an array of characters terminated by a null, which answers your first question.
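That terminator is all that functions such as strlen have to go on. A minimal sketch of the idea, assuming a valid null-terminated input (c_string_length is a made-up name, not the library's implementation):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical re-implementation of the idea behind strlen:
// walk the characters until the terminating null byte is found.
std::size_t c_string_length(const char* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;  // the terminator itself is not counted
}
```

If the array has no terminator, the loop runs past the end of the array, which is why the null byte matters so much.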
