In the following code snippet
char *str1 = "abcd";
char str2[] = "defg";
I understand that the first statement makes str1 point to a string literal stored in the read-only section of the executable, while the second creates an array in read-write storage. Examining the generated instructions, I verified that the first one stores the address of "abcd", which lives in the .rodata section, into str1.
The second statement was the interesting one. The compiler inserted code that stores the values directly into str2:
char *str1 = "abcd";
8048420: c7 44 24 10 20 85 04 movl $0x8048520,0x10(%esp)
8048427: 08
char str2[] = "defg";
8048428: c7 44 24 17 64 65 66 movl $0x67666564,0x17(%esp)
804842f: 67
8048430: c6 44 24 1b 00 movb $0x0,0x1b(%esp)
How does the compiler decide when to use which approach?
Note: I am running a precise32 Vagrant box, using gcc with debug symbols and -O0.
When an aggregate object in memory is initialized with a compile-time aggregate value (which is not limited to string literals), the compiler always has a choice:
1. Pre-build the complete initializer in the read-only data section at compile time, and then just copy the whole thing into the modifiable target value by using memcpy at run time.
2. Generate code that will directly build the target value "in-place" piece-by-piece at run time.
Basically, the first is the "data-based" approach and the second is the "code-based" approach. In your case the compiler uses the code-based solution, probably because the literal is short. Use a longer literal and, I suspect, it will eventually switch to the first approach.
One can probably imagine that in some cases a mixed approach might be used by some compiler: part of the data is pre-built somewhere and memcpy-ed from there, the rest of the data is built on the fly.
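To make the two approaches concrete, here is a minimal sketch that writes out by hand what the compiler may generate for char str2[] = "defg";. The names defg_image, init_by_copy and init_by_stores are hypothetical and exist only for illustration; which form a real compiler picks depends on the compiler, the target and the initializer's size.

```c
#include <string.h>

/* Data-based: the complete initializer is stored as constant data
   (hypothetical name) and copied into the target in one step. */
static const char defg_image[5] = "defg";

void init_by_copy(char dst[5])
{
    memcpy(dst, defg_image, sizeof defg_image);
}

/* Code-based: the bytes are part of the instructions themselves and
   are stored piece by piece. The disassembly in the question does the
   same thing with one 32-bit store plus one byte store for the '\0'. */
void init_by_stores(char dst[5])
{
    dst[0] = 'd';
    dst[1] = 'e';
    dst[2] = 'f';
    dst[3] = 'g';
    dst[4] = '\0';
}
```

Both functions leave the same five bytes in dst; they differ only in where the bytes come from.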
If your char str2[] = "defg"; definition is inside a function, then the compiler will generate instructions to put the data on the stack (ignoring possible optimizations, e.g. keeping values purely in registers). This works just as it does for other automatic (stack) variables.
It also has the option of copying the data to the stack from somewhere else, instead of e.g. encoding the data values as immediate operands to instructions. It might choose to do this for longer strings to avoid code bloat.
Regardless of what the compiler does, modifications to the contents of str2 must not be visible to the next invocation of the function (just as for other automatic variables).
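A small sketch of that guarantee, using a hypothetical function first_char_auto: the array is initialized again on every call, so a write made during one call cannot be observed by the next.

```c
/* Returns the first character of str2 as seen on entry,
   then overwrites it. Hypothetical helper for illustration. */
char first_char_auto(void)
{
    char str2[] = "defg";  /* automatic: initialized on every call */
    char seen = str2[0];
    str2[0] = 'X';         /* this change is lost when the function returns */
    return seen;
}
```

Calling first_char_auto() any number of times yields 'd' each time.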
If str2 is global (which gives it static storage duration), then the data will end up in the read/write data segment. This also happens if you give the array static storage duration inside the function, as in
static char str2[] = "defg";
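For contrast, a sketch of the static case with a hypothetical function first_char_static: the array is initialized once, before the program first uses it, so a modification persists across calls.

```c
/* Returns the first character of str2 as seen on entry,
   then overwrites it. Hypothetical helper for illustration. */
char first_char_static(void)
{
    static char str2[] = "defg";  /* static: initialized only once */
    char seen = str2[0];
    str2[0] = 'X';                /* this change survives the return */
    return seen;
}
```

The first call returns 'd'; every later call returns 'X', because the write went to the single long-lived copy of the array.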
When initializing a pointer with a string literal, as in
char *s = "defg";
the data ends up in the read-only data segment, and the rules for how the pointer itself is initialized with the address of the data are the same as above.
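Since writing through such a pointer is undefined behavior in C, a common way to get a modifiable string is to copy the literal into writable storage first. The sketch below assumes a hypothetical helper capitalized_copy and declares the pointer const so that an accidental write is rejected at compile time.

```c
#include <string.h>

/* Copies the literal "defg" into caller-provided writable storage
   and modifies the copy. Hypothetical helper for illustration. */
void capitalized_copy(char out[5])
{
    const char *s = "defg";      /* s is an ordinary variable; the bytes it
                                    points to must not be written */
    memcpy(out, s, strlen(s) + 1); /* 4 characters plus the '\0' */
    out[0] = 'D';                /* fine: out is writable */
}
```

After the call, the caller's buffer holds "Defg" while the literal itself is untouched.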