
Malloc altering behavior of uninitialized variable in separate function?

This is a question for my Programming Langs Concepts/Implementation class. Given the following C code snippet:

void foo()  
{ 
    int i; 
    printf("%d ", i++); 
} 
void main()  
{ 
    int j; 
    for (j = 1; j <= 10; j++)  
        foo(); 
}

The local variable i in foo is never initialized, yet on most systems it behaves like a static variable: the program prints 0 1 2 3 4 5 6 7 8 9. I understand why it does this (the memory location of i never changes), but the homework question asks us to modify the code (without changing foo) to alter this behavior. I've come up with a solution that works and makes the program print ten 0's, but I don't know if it's the "right" solution and, to be honest, I don't exactly know why it works.

Here is my solution:

void main()  
{ 
    int j; 
    void* some_ptr = NULL;
    for (j = 1; j <= 10; j++)
    {
        some_ptr = malloc(sizeof(void*));
        foo();
        free(some_ptr);
    }
}

My original thought process was that i wasn't changing locations because there was no other memory manipulation happening around the calls of foo, so allocating a variable should disrupt that. But since some_ptr is allocated on the heap and i is on the stack, shouldn't the allocation of some_ptr have no effect on i? My thought is that the compiler is playing some games with the optimization of that subroutine call. Could anyone clarify?

There cannot be a "right" solution. But there can be a class of solutions which work for a particular CPU architecture, ABI, compiler, and compiler options.

Changing the code to something like this overwrites the stack memory just beyond main's frame, where foo's i lives during each call. That should alter the output in the targeted way in many, if not most, environments.

void foo()  
{ 
    int i; 
    printf("%d ", i++); 
} 
void main()  
{ 
    int j;
    int a [2];

    for (j = 1; j <= 10; j++)
    {
        foo();
        a [-5] = j * 100;
    }
}

Output (gcc x64 on Linux):

0 100 200 300 400 500 600 700 800 900 

The index -5 is the number of words of stack taken up by overhead and variables between main's a[] and foo's i: the return address, the saved stack link value, and so on. The stack likely looks like this when main() writes to a[-5], with the top line being the slot that foo() uses for i:

i
saved stack link
return address
main's j
(must be something else)
main's a[]

I guessed -5 on the second try. -4 was my first guess.

When you call foo() from main(), the (uninitialized) variable i is allocated at a memory address. In the original code, it so happens that this location holds zero (on your machine, with your compiler, your chosen compilation options, your environment settings, and given the current phase of the moon; it might change when any of these, or myriad other factors, changes).

By calling another function before calling foo(), you allow the other function to overwrite the memory location that foo() will use for i with a different value. It isn't guaranteed to change; you could, by bad luck, replace the zero with another zero.

You could perhaps use another function:

static void bar(void)
{
    int j;
    for (j = 10; j < 20; j++)
        printf("%d\n", j);
}

and calling that before calling foo() will change the value in i. Calling malloc() changes things too. Calling pretty much any function will probably change it.
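The mechanism can be modeled without undefined behaviour by using an explicit array as a stand-in for stack memory. This is only a sketch: toy_stack, SLOT, toy_foo, toy_other and run are hypothetical names invented for the illustration, and a real stack is not required to behave this way.

```c
#include <stdio.h>

/* Hypothetical stand-in for stack memory. It has static storage,
   so it starts out zeroed, like the lucky zero in the original program. */
static int toy_stack[8];

/* The slot that both toy functions happen to use (an arbitrary choice). */
enum { SLOT = 3 };

/* Models foo(): reads and increments its slot without ever initializing it. */
static int toy_foo(void)
{
    int *i = &toy_stack[SLOT];
    return (*i)++;
}

/* Models any other callee (malloc, bar, ...) whose locals land on the same slot. */
static void toy_other(int scratch)
{
    toy_stack[SLOT] = scratch;
}

/* Calls toy_foo() n times, optionally calling toy_other(0) before each call.
   Prints every value and returns the last one. */
static int run(int n, int disturb)
{
    int last = -1;
    for (int k = 0; k < n; k++) {
        if (disturb)
            toy_other(0);
        last = toy_foo();
        printf("%d ", last);
    }
    printf("\n");
    return last;
}
```

Calling run(10, 0) on a fresh array prints 0 through 9, because nothing touches the slot between calls. Calling run(10, 1) prints ten zeros, because the other callee rewrites the slot each time. In the real program the value left behind by malloc() or bar() need not be zero.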

However, it must be (re)emphasized that the original code is laden with undefined behaviour, and calling other functions doesn't make it any less undefined. Anything can happen and it is valid.

The variable i in foo is simply uninitialized, and an uninitialized automatic variable has an indeterminate value upon entering the block. The particular values you saw printed are entirely a coincidence, and to write standard-conforming C you should never rely on such behavior. You should always initialize automatic variables before using them.

From c11std 6.2.4p6:

For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
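As a minimal sketch of what stating the intent explicitly could look like (next_fresh and next_counter are hypothetical names, not part of the homework code), either initialize the automatic variable on every call, or ask for persistence with static:

```c
/* Starts from 0 on every call: the automatic variable is
   initialized each time the declaration is reached. */
static int next_fresh(void)
{
    int i = 0;
    return i++;
}

/* Keeps its value across calls by design: static storage
   duration, initialized once before the program starts. */
static int next_counter(void)
{
    static int i = 0;
    return i++;
}
```

Ten calls to next_fresh() always yield ten zeros, and ten calls to next_counter() always yield 0 through 9, regardless of compiler, options, or what other functions run in between.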

The reason the uninitialized value seems to keep its value from past calls is that it is on the stack and the stack pointer happens to have the same value every time the function is called.

The reason your code might be changing the value is that you started calling other functions: malloc and free. Their internal stack variables are using the same location as i in foo().
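One way to observe the "same location every time" effect without reading an uninitialized variable is to compare the address of a local across calls. This is a rough sketch with hypothetical helper names (local_address, count_same_address). The pointer-to-integer conversion is implementation-defined, and an optimizing compiler may inline the function and change the result.

```c
#include <stdint.h>

/* Returns the address of an initialized local, converted to an integer
   while the object is still alive. */
static uintptr_t local_address(void)
{
    int i = 0;
    uintptr_t addr = (uintptr_t)(void *)&i;
    return addr;
}

/* Counts how many of `calls` further calls place the local at the same
   address as the first call did. */
static int count_same_address(int calls)
{
    uintptr_t first = local_address();
    int same = 0;
    for (int k = 0; k < calls; k++) {
        if (local_address() == first)
            same++;
    }
    return same;
}
```

On a typical unoptimized build, count_same_address(10) is likely to report 10 when called from the same place in main, which matches the explanation above. Nothing in the C standard guarantees that number.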

As for optimization, small programs like this are in danger of disappearing entirely. GCC or Clang might decide that since using an uninitialized variable is undefined behavior, the compiler is within its rights to completely remove the code. Or it might put i in a register set to zero. Then decide all printf calls output zero. Then decide that your entire program is simply a single puts("0000000000") call.
