
Can a VS2010 C++/C# compiler optimize away variables declared inside of the loop?

I am pretty new at my place, so I should think twice before voicing concerns, but some of the code I have seen ...

When I try to bring up readability, I am told that there is not always time for that, that efficiency is far more important.

But then I see variable redeclaration inside of different types of loops, sometimes down to two levels. Part of me thinks - do not ever do that! But the other part says - this complicated function should be broken down into several functions anyway. Those smaller functions can have temp variables, and a compiler should be able to take care of them.

Then the function calls add some additional cost to it. Let me try to come up with 2 examples:

void Class1::Do1()
{
    for (int i = 0; i < 100; i++)
    {
        bool x = GetSomeValue();
        ...
        if (x)
        {
            ...
        }
    } 
}

vs

void Class1::Do1()
{
    bool x = false;
    for (int i = 0; i < 100; i++)
    {
        x = GetSomeValue();
        ...
        if (x)
        {
            ...
        }
    } 
}

vs

void Class1::Do1()
{
    for (int i = 0; i < 100; i++)
    {
        Do2();
    } 
}

void Class1::Do2()
{
    bool x = GetSomeValue();
    ...
    if (x)
    {
        ...
    }
}

The first way looks wrong to me, I always prefer second or even third when I write the code myself. I think that the third way might be even slower due to extra function calls. The first way might even look sketchy at times - in case the function is long, the declaration will be a number of lines away from where it is used. The other thing is that my example is too simple - the compiler could probably figure out how to simplify, and perhaps inline all 3. Unfortunately right now I cannot recall other examples of what I think is sloppiness, just want to mention that some variables are redeclared n*m times, because they are two levels deep (within 2 loops).

The devil's advocate says - how do you know 100% that this might not be efficient? The purist (my version of it) in me thinks that it is stupid to redeclare the same variable over and over and over - at the very least it throws one off when reading the code.

Thoughts? Questions?

As far as I recall all local variables are assigned stack space at the beginning of the method call, so it should not matter whether you declare the variable within the loop or before it.

That being the case, I would code for readability: if you don't need the variable outside the loop, I personally would declare it in the loop, to keep it close to the code that actually uses it and to reduce the scope of the variable as much as possible.

Yes, this is one of the most basic optimizations the compiler can perform, when it is applicable.

Of course, the compiler can only do this when it doesn't alter the semantics of the program.

However, there's another important aspect you're missing. You're assuming that there is a cost to declaring the variable inside the loop.

How much time do you think it takes to declare a variable of type bool? It's essentially free. The compiler doesn't have to do anything other than adjust the stack pointer (which it would have to do regardless of where in the function the variable was declared) and assign a value to it (which in your example happens inside the loop in any case).

So the difference in performance is zero. This is the kind of purely mechanical optimization that the compiler excels at. It knows how to allocate memory on the stack, and it knows the exact cost of doing this, and of initializing or assigning to a variable.

Whenever possible, you should declare your variables in the smallest possible scope. Declare them when you need them, and no sooner. So the approach you prefer (declaring x before the loop) is yucky, and it's not even any faster, so it's basically just a bad idea.

In C#, there's never really any downside to this. Either the variable is a value type, in which case allocating it is free (the same stack space is reused over the entire loop, so there's no cost to declaring the variable inside the loop), or it's a reference variable (in which case you're just putting a reference on the stack, which is free for the same reason).

In C++, there are cases where this won't work, or where there's a real cost to it. The variable might, in its constructor, perform some expensive operation that can't be optimized away, and then that will have to be executed on every iteration if the variable is declared inside the loop. The compiler can't move the variable outside the loop, because it has to be initialized in every iteration of the loop: that's what you, the programmer, specified, and the only way to achieve this is to call the constructor.

Then it may be worth moving the object outside the loop. (But then, of course, the assignment operator is executed on every iteration instead of the constructor, so hopefully that is a cheaper operation.)
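To make the C++ case concrete, here is a minimal sketch. The type Expensive, its counters, and both function names are made up for illustration; they are not from the question. It only shows how often the constructor and the assignment operator run in each layout. Whether the assignment is actually cheaper depends entirely on the real type.

```cpp
#include <cstddef>
#include <string>

// Hypothetical type whose constructor does real work (it fills a string).
// The static counters exist only so the example can observe what runs.
struct Expensive {
    static int constructions;
    static int assignments;
    std::string payload;

    explicit Expensive(std::size_t n) : payload(n, 'x') { ++constructions; }

    // Assigning a size refills the existing buffer instead of building a new object.
    Expensive& operator=(std::size_t n) {
        payload.assign(n, 'y');
        ++assignments;
        return *this;
    }
};
int Expensive::constructions = 0;
int Expensive::assignments = 0;

// Declared inside the loop: the constructor (and destructor) run on every iteration.
std::size_t declared_inside(int iterations) {
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        Expensive e(64);
        total += e.payload.size();
    }
    return total;
}

// Declared before the loop: one construction, then one assignment per iteration.
std::size_t declared_outside(int iterations) {
    std::size_t total = 0;
    Expensive e(64);
    for (int i = 0; i < iterations; ++i) {
        e = std::size_t(64);
        total += e.payload.size();
    }
    return total;
}
```

With 100 iterations, the first function constructs 100 objects and the second constructs one and assigns 100 times. For a plain bool or int neither layout has anything comparable to count, which is why the distinction only matters for types with non-trivial constructors.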

The code generated for 1 & 2 will be identical. It doesn't matter where you declare variables. Here's an example (with Console.WriteLine("Yes") in the if statement):

.maxstack 2
.locals init (
    [0] bool x,
    [1] int32 i)
L_0000: ldc.i4.0 
L_0001: stloc.1 
L_0002: br.s L_001b
L_0004: call bool ConsoleApplication8.Program::GetSomeValue()
L_0009: stloc.0 
L_000a: ldloc.0 
L_000b: brfalse.s L_0017
L_000d: ldstr "Yes"
L_0012: call void [mscorlib]System.Console::WriteLine(string)
L_0017: ldloc.1 
L_0018: ldc.i4.1 
L_0019: add 
L_001a: stloc.1 
L_001b: ldloc.1 
L_001c: ldc.i4.s 100
L_001e: blt.s L_0004
L_0020: ret 

Stylistically, however, I'd say to declare your variables as close as possible to where you use them.

In terms of performance, what matters more for these examples is whether or not the GetSomeValue() call can be moved outside the loop (i.e., whether its result is invariant throughout the loop). In some cases the compiler can detect this itself.
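As a rough sketch of what moving the call out of the loop means, assume a stand-in get_some_value() whose result does not change between iterations. The function, its call counter, and the two loop functions are hypothetical names added here for illustration.

```cpp
// Hypothetical stand-in for GetSomeValue(); the counter makes each call observable.
static int call_count = 0;

bool get_some_value() {
    ++call_count;
    return true;
}

// The call stays inside the loop: it runs once per iteration.
int count_hits_in_loop(int n) {
    int hits = 0;
    for (int i = 0; i < n; ++i) {
        bool x = get_some_value();
        if (x) {
            ++hits;
        }
    }
    return hits;
}

// The call is hoisted by hand: it runs once, and the loop reuses the result.
// This is only correct if the result really is the same on every iteration.
int count_hits_hoisted(int n) {
    int hits = 0;
    const bool x = get_some_value();
    for (int i = 0; i < n; ++i) {
        if (x) {
            ++hits;
        }
    }
    return hits;
}
```

Both versions return the same count, but the first calls the function n times and the second calls it once. Note that the counter is itself a visible side effect, so a compiler would not be allowed to do this hoisting automatically here; that is the kind of case where the programmer has to decide.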

#3 will depend on whether or not the function Do2() is inlined into the loop. If it is, it will probably result in the exact same generated code (not MSIL) in the end.

What you do, is you stop trying to be smarter than your tools and you just use them! Get a profiler, profile your code, and MEASURE performance!!! Then, AND ONLY THEN, you will know if it's even something you need to concern yourself with.
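If no profiler is at hand, even a crude stopwatch is better than guessing. The sketch below uses std::chrono::steady_clock; the helper elapsed_ms and the two workload functions are hypothetical names, and the workloads are toy stand-ins for the two loop styles from the question. It is not a replacement for a real profiler, and a single run in a debug build tells you very little.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double elapsed_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Toy workload with the temporary declared inside the loop.
long sum_declared_inside(int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        bool odd = (i % 2) != 0;
        if (odd) {
            sum += i;
        }
    }
    return sum;
}

// Same workload with the temporary declared before the loop.
long sum_declared_outside(int n) {
    long sum = 0;
    bool odd = false;
    for (int i = 0; i < n; ++i) {
        odd = (i % 2) != 0;
        if (odd) {
            sum += i;
        }
    }
    return sum;
}
```

A call such as elapsed_ms([] { sum_declared_inside(1000000); }) gives one sample. To get numbers worth discussing, build with optimizations on, repeat the measurement several times, and make sure the result is actually used (for example by storing it in a volatile variable) so the optimizer cannot delete the loop outright.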

I know, it just sounds so darn counterintuitive to think that maybe, just maybe, you can't possibly know every damn thing your compiler may or may not optimize away. Granted, it seems to be more common for developers, experienced ones included, to think they know what they CAN'T know and then write huge mazes of unmaintainable, and unoptimizable, blobs of code, all in the name of efficiency. What they SHOULD be doing, though, is making sure the code is maintainable FIRST and THEN figuring out which bottlenecks, if any, need to be opened up.

Maintainability is usually more important, but this particular example is not a capital offense.

What I think is funny / typical is that you're being told it's more efficient.

If performance is seriously important, then performance should be dealt with.
Here's an example of the method I use.
