
Why does initializing C local character arrays internally store the strings in different stack/data segments?

In some position-independent C shellcode meant for injection, the strings were initially coded with this array initialization:

char winexec[] = "WinExec";

However, this caused the shellcode to fail because the string WinExec was stored in the data segment of the injector, which the injectee could not access.

To fix this, the array initialization was changed to:

char winexec[] = { 'W','i','n','E','x','e','c','\0' };

which worked perfectly because the string was stored in the injectee's local stack segment.

For example (https://godbolt.org/z/v8cqn5E56):

#include <stdio.h>

int main(void)
{
    /* String stored in the stack segment */
    char winexecStack[] = { 'W','i','n','E','x','e','c','\0' };

    /* String stored in the data segment */
    char winexecData[] = "WinExec";
    
    printf("Stack Segment: %s\n", winexecStack);
    printf("Data Segment:  %s\n", winexecData);     
    
    return 0;
}

Question

Why does C have multiple ways to initialize local arrays which externally appear the same, but internally the strings are stored very differently?

Do tidier methods exist to initialize a C character array on the stack? Maybe something like:

char winexecStack[8];
winexecStack[0] = 'W';
winexecStack[1] = 'i';
winexecStack[2] = 'n';
winexecStack[3] = 'E';
winexecStack[4] = 'x';
winexecStack[5] = 'e';
winexecStack[6] = 'c';
winexecStack[7] = '\0';

or converting strings such as "Hello, World!" to little-endian values in an array:

unsigned long long hello[] = { 0x57202C6F6C6C6548,0x00000021646C726F };
printf("Stack Segment: %s\n", (char*)&hello);

Perhaps strings of 8 bytes or fewer (terminator included) could be represented as a numerical value, stored on the stack but treated as a char*, for example "WinExec":

unsigned long long winexec = 0x00636578456e6957;
printf("Stack Segment: %s\n", (char*)&winexec);

Why does C have multiple ways to initialize local arrays which externally appear the same, but internally the strings are stored very differently?

It doesn't. That you observe the source data for the initializers to be stored differently in the two cases is a function of your C implementation. It is not required by the C language itself. More generally, C has a lot to say about what is stored, but less to say about how it is stored, and almost nothing to say about where it is stored.

Do tidier methods exist to initialize a C character array on the stack?

A valid character array initializer takes one of the two forms you show.

Note also that "on the stack" is not a C concept (see "almost nothing to say about where" above).

Turning on optimization with /O2 makes the difference vanish. This suggests that, without optimization, the compiler translates the source fairly literally: the array induced by a string literal is placed in a data segment (for static storage), while the individual character initializers are treated as small constants. With optimization turned on, the compiler analyzes the code more deeply, and the constant proposed in the question, 0x00636578456e6957, in fact appears in the generated assembly.


 