
How is static variable initialization implemented by the compiler?

I'm curious about the underlying implementation of static variables within a function.

If I declare a static variable of a fundamental type (char, int, double, etc.), and give it an initial value, I imagine that the compiler simply sets the value of that variable at the very beginning of the program before main() is called:

void SomeFunction();

int main(int argCount, char ** argList)
{
    // at this point, the memory reserved for 'answer'
    // already contains the value of 42
    SomeFunction();
}

void SomeFunction()
{
    static int answer = 42;
}

However, if the static variable is an instance of a class:

class MyClass
{
    //...
};

void SomeFunction();

int main(int argCount, char ** argList)
{
    SomeFunction();
}

void SomeFunction()
{
    static MyClass myVar;
}

I know that it will not be initialized until the first time that the function is called. Since the compiler has no way of knowing when the function will be called for the first time, how does it produce this behavior? Does it essentially introduce an if-block into the function body?

static bool initialized = false;
if (!initialized)
{
    // construct myVar
    initialized = true;
}

In the compiler output I have seen, function local static variables are initialized exactly as you imagine.

Note that before C++11 this was generally not done in a thread-safe manner. So if you have functions with static locals like this that might be called from multiple threads on such a compiler, you should take this into account. Calling the function once in the main thread before any other threads are started will usually do the trick.

I should add that if the initialization of the local static is by a simple constant like in your example, the compiler doesn't need to go through these gyrations - it can just initialize the variable in the image or before main() like a regular static initialization (because your program wouldn't be able to tell the difference). But if you initialize it with a function's return value, then the compiler pretty much has to test a flag indicating if the initialization has been done or something equivalent.
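To make the flag idea concrete, here is a minimal sketch that puts a normal function-local static next to a hand-written, non-thread-safe equivalent. The names Widget, constructions, builtin_static and by_hand are made up for illustration, and real compilers generate something more elaborate than this.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

int constructions = 0;  // hypothetical counter, only here to observe when constructors run

struct Widget {
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) { ++constructions; }
};

// What you write: the compiler arranges for construction on the first call.
Widget& builtin_static() {
    static Widget w("builtin");
    return w;
}

// Roughly what a simple, non-thread-safe implementation could boil down to.
Widget& by_hand() {
    alignas(Widget) static unsigned char storage[sizeof(Widget)];  // raw memory for the object
    static Widget* object = nullptr;
    static bool initialized = false;  // constant-initialized, so it needs no flag of its own
    if (!initialized) {
        object = new (storage) Widget("by hand");  // construct in place on the first call
        initialized = true;
    }
    return *object;
}
```

Unlike the built-in version, this hand-rolled sketch never runs the Widget destructor at program exit; a real implementation also registers the destructor.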

This question covered similar ground, but thread safety wasn't mentioned. For what it's worth, C++11 (known as C++0x while it was being drafted) makes function-static initialization thread-safe.

(See the C++0x FCD, 6.7/4 on function statics: "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.")
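As a rough illustration of that guarantee, the sketch below lets several threads race to make the first call and then counts how many times the constructor ran. It assumes a C++11 or later compiler with thread support; Expensive, instance and race_first_call are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};  // counts constructor runs across all threads

struct Expensive {
    Expensive() { ++constructions; }
};

Expensive& instance() {
    static Expensive e;  // C++11 and later: initialized once, even if first calls race
    return e;
}

// Starts thread_count threads that all call instance(), waits for them,
// and returns how many times the constructor ran.
int race_first_call(int thread_count) {
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i)
        threads.emplace_back([] { instance(); });
    for (auto& t : threads)
        t.join();
    return constructions.load();
}
```

With a conforming implementation, race_first_call(8) should report exactly one construction. On some toolchains you may need to build with -pthread.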

One other thing that hasn't been mentioned is that function statics are destructed in reverse order of their construction, so the compiler maintains a list of destructors to call on shutdown (this may or may not be the same list that atexit uses).
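A small sketch of the destruction order, using hypothetical names (Tracer, event_log, first, second). Construction order here depends on which function is called first, and the destructors run in the opposite order when the program exits.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Log of constructor calls, itself a function-local static.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name(std::move(n)) {
        event_log().push_back("ctor " + name);
    }
    ~Tracer() { std::printf("dtor %s\n", name.c_str()); }  // runs during program shutdown
};

void first()  { static Tracer a("A"); }
void second() { static Tracer b("B"); }
```

If a program calls second() and then first(), B is constructed before A, so at exit A is destroyed before B.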

You're right about everything, including the initialized flag as a common implementation. This is basically why initialization of static locals was not thread-safe before C++11, and why pthread_once exists.

One slight caveat: the compiler must emit code which behaves "as if" the static local variable is initialized the first time control passes through its declaration. Since initializing an integer from a constant has no side effects (and calls no user code), it's up to the compiler when it initializes the int. User code cannot "legitimately" find out what it does.

Obviously you can look at the assembly code, or provoke undefined behaviour and make deductions from what actually happens. But the C++ standard doesn't count that as valid grounds to claim that the behaviour is not "as if" it did what the spec says.

I know that it will not be initialized until the first time that the function is called. Since the compiler has no way of knowing when the function will be called for the first time, how does it produce this behavior? Does it essentially introduce an if-block into the function body?

Yes, that's right. And, FWIW, before C++11 it's not necessarily thread-safe (if the function is called "for the first time" by two threads simultaneously).

For that reason you might prefer to define the variable at global scope (although maybe in a class or namespace, or static without external linkage) instead of inside a function, so that it's initialized before the program starts without any run-time "if".

Another twist is in embedded code, where the run-before-main() code (cinit or similar) may copy pre-initialized data (both statics and non-statics) into RAM from a const data segment, perhaps residing in ROM. This is useful where the code is not running from some sort of backing store (such as a disk) from which it could be reloaded. Again, this doesn't violate the requirements of the language, since it is done before main().
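The copy step itself is simple. The sketch below simulates it on ordinary arrays; copy_initialized_data is a hypothetical name, and on a real target the addresses would come from linker-script symbols and the routine would run from the startup code before main().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical startup helper: copy the initial values of .data from their
// load address (for example in ROM or flash) to their run address in RAM.
void copy_initialized_data(const unsigned char* load_addr,
                           unsigned char* run_addr,
                           std::size_t size) {
    std::memcpy(run_addr, load_addr, size);
}
```

After this runs, a variable such as a static int initialized to 42 already holds its value in RAM, so no run-time flag is needed for it.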

Slight tangent: While I've not seen it done much (outside of Emacs), a program or compiler could basically run your code in a process and instantiate/initialize objects, then freeze and dump the process. Emacs does something similar to this to load up large amounts of elisp (i.e. chew on it), then dump the running state as the working executable, to avoid the cost of parsing on each invocation.

What matters isn't whether the variable has class type; it's whether the initializer can be evaluated at compile time (at the current optimization level) and, if the constructor is non-trivial, whether it is free of side effects.

If it's not possible to simply put a constant value in .data, gcc/clang use an acquire load of a guard variable to check that static locals have been initialized. If the guard variable is false, then they pick one thread to do the initializing, and have other threads wait for it if they also see a false guard variable. They've been doing this for a long time, since before C++11 required it (e.g. as far back as GCC 4.1 on Godbolt, from May 2006).

The simplest artificial example, snapshotting the arg from the first call and ignoring later args:

int foo(int a){
    static int x = a;
    return x;
}

Compiles for x86-64 with GCC 11.3 -O3 (Godbolt), with the exact same asm generated for -std=gnu++03 mode. GCC 4.1 also makes about the same asm, but doesn't keep the push/pop off the fast path (i.e. it is missing the shrink-wrap optimization). GCC 4.1 only supported AT&T syntax output, so it visually looks different unless you flip modern GCC to AT&T mode as well, but this is Intel syntax (destination on the left).

# demangled asm from g++ -O3
foo(int):
        movzx   eax, BYTE PTR guard variable for foo(int)::x[rip]  # guard.load(acquire)
        test    al, al
        je      .L13
        mov     eax, DWORD PTR foo(int)::x[rip]    # normal load of the static local
        ret              # fast path through the function is the already-initialized case


.L13:            # jumps here on guard == 0, on the first call (and any that race with it)
                 # It would be sensible for GCC to put this code in .text.cold
        push    rbx
        mov     ebx, edi             # save function arg in a call-preserved reg
        mov     edi, OFFSET FLAT:guard variable for foo(int)::x  # address
        call    __cxa_guard_acquire          # guard_acquire(&guard_x) presumably a normal mutex or spinlock
        test    eax, eax 
        jne     .L14                         # if (we won the race to do the init work) goto .L14
        mov     eax, DWORD PTR foo(int)::x[rip]  # else it's done now by another thread
        pop     rbx
        ret
.L14:
        mov     edi, OFFSET FLAT:guard variable for foo(int)::x
        mov     DWORD PTR foo(int)::x[rip], ebx       # init static x (from a saved in RBX)
        call    __cxa_guard_release
        mov     eax, DWORD PTR foo(int)::x[rip]       # missed optimization:  mov eax, ebx  
                # This thread is the one that just initialized it, our function arg is the value. 
                # It's not atomic (or volatile), so another thread can't have set it, too.
        pop     rbx
        ret

If compiling for AArch64, the load of the guard variable is ldarb w8, [x8], a load with acquire semantics. Other ISAs might need a plain load and then a barrier to give at least LoadLoad ordering, to make sure they load the payload x no earlier than when they saw the guard variable being non-zero.
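For readers who prefer C++ to asm, here is a hand-written approximation of the same pattern: an acquire load on the fast path, then a lock and a second check on the slow path. It is only a sketch. foo_by_hand is a made-up name, and std::mutex stands in for whatever __cxa_guard_acquire and __cxa_guard_release really do internally.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

int foo_by_hand(int a) {
    static std::atomic<bool> guard{false};  // constant-initialized, needs no guard itself
    static std::mutex init_mutex;           // stand-in for the __cxa_guard_* locking
    static int x;                           // zero-initialized payload

    if (!guard.load(std::memory_order_acquire)) {       // fast-path check, like the movzx/test
        std::lock_guard<std::mutex> lock(init_mutex);   // slow path: serialize the racers
        if (!guard.load(std::memory_order_relaxed)) {   // did this thread win the race?
            x = a;                                      // do the one-time initialization
            guard.store(true, std::memory_order_release);  // publish x to later acquire loads
        }
    }
    return x;  // either initialized here or seen through the acquire/lock ordering
}
```

As with the compiler-generated version, the first call's argument is captured and later arguments are ignored.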


If the static variable has a constant initializer, no guard is needed:

int bar(int a){
    static int x = 1;
    return ++x + a;
}

bar(int):
        mov     eax, DWORD PTR bar(int)::x[rip]
        add     eax, 1
        mov     DWORD PTR bar(int)::x[rip], eax   # store the updated value
        add     eax, edi                          # and add it to the function arg
        ret

.section .data

bar(int)::x:
        .long   1
