Can a VS2010 C++/C# compiler optimize away variables declared inside of the loop?

I am pretty new at my workplace, so I should think twice before voicing concerns, but some of the code I have seen ...

When I try to bring up readability, I am told that there is not always time for that, and that efficiency is far more important.

But then I see variable redeclaration inside of different types of loops, sometimes down to two levels. Part of me thinks - do not ever do that! But the other part says - this complicated function should be broken down into several functions anyway. Those smaller functions can have temp variables, and a compiler should be able to take care of them.

Then the function calls add some additional cost to it. Let me try to come up with 3 examples:

Class1::Do1()
{
    for (int i = 0; i < 100; i++)
    {
        bool x = GetSomeValue();
        ...
        if (x)
        {
            ...
        }
    } 
}

vs

Class1::Do1()
{
    bool x = false;
    for (int i = 0; i < 100; i++)
    {
        x = GetSomeValue();
        ...
        if (x)
        {
            ...
        }
    } 
}

vs

Class1::Do1()
{
    for (int i = 0; i < 100; i++)
    {
        Do2();
    } 
}

Class1::Do2()
{
    bool x = GetSomeValue();
    ...
    if (x)
    {
        ...
    }
}

The first way looks wrong to me; I always prefer the second or even the third when I write the code myself. I think that the third way might be even slower due to extra function calls. The first way might even look sketchy at times - in case the function is long, the declaration will be a number of lines away from where it is used. The other thing is that my example is too simple - the compiler could probably figure out how to simplify, and perhaps inline, all 3. Unfortunately, right now I cannot recall other examples of what I think is sloppiness; I just want to mention that some variables are redeclared n*m times, because they are two levels deep (within 2 loops).

The devil's advocate says - how do you know 100% that this might not be efficient? The purist (my version of it) in me thinks that it is stupid to redeclare the same variable over and over and over - at the very least it throws one off when reading the code.

Thoughts? Questions?

As far as I recall, all local variables are assigned stack space at the beginning of the method call, so it should not matter whether you declare the variable within the loop or before it.

That being the case, I would code for readability - if you don't need the variable outside the loop, I personally would declare it in the loop, to keep it closer to the code that actually uses it and to reduce the scope of the variable as much as possible.

Yes, this is one of the most basic optimizations the compiler can perform, when it is applicable.

Of course, the compiler can only do this when it doesn't alter the semantics of the program.

However, there's another important aspect you're missing. You're assuming that there is a cost to declaring the variable inside the loop.

How much time do you think it takes to declare a variable of type bool? It's essentially free. The compiler doesn't have to do anything other than adjust the stack pointer (which it would have to do regardless of where in the function the variable was declared) and assign a value to it (which in your example happens inside the loop in any case).

So the difference in performance is zero. This is the kind of purely mechanical optimization that the compiler excels at. It knows how to allocate memory on the stack, and it knows the exact cost of doing this, and of initializing or assigning to a variable.

Whenever possible, you should declare your variables in the smallest possible scope. Declare them when you need them, and no sooner. So your second approach (declaring the variable before the loop) is yucky, and it's not even any faster, so it's basically just a bad idea.
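To make the comparison concrete, here is a minimal C++ sketch of the two declaration styles side by side. IsEven is a hypothetical stand-in for GetSomeValue(), and the function names are made up for illustration. It only demonstrates that both forms compute the same thing; whether the generated machine code is identical depends on the compiler and its settings.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GetSomeValue(): reports whether a value is even.
static bool IsEven(int value) { return value % 2 == 0; }

// Style 1: the flag is declared in the smallest scope, inside the loop body.
int CountDeclaredInside(const std::vector<int>& values) {
    int hits = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        bool x = IsEven(values[i]);
        if (x) {
            ++hits;
        }
    }
    return hits;
}

// Style 2: the flag is declared once before the loop and assigned each time.
int CountDeclaredOutside(const std::vector<int>& values) {
    int hits = 0;
    bool x = false;
    for (std::size_t i = 0; i < values.size(); ++i) {
        x = IsEven(values[i]);
        if (x) {
            ++hits;
        }
    }
    return hits;
}
```

In style 2 the variable stays visible after the loop, which is the only observable difference between the two, and it is a readability cost rather than a performance gain.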

In C#, there's never really any downside to this. Either the variable is a value type, in which case allocating it is free (the same stack space is reused over the entire loop, so there's no cost to declaring the variable inside the loop), or it's a reference variable (in which case you're just putting a reference on the stack, which is free for the same reason).

In C++, there are cases where this won't work, or where there's a real cost to it. The variable might, in its constructor, perform some expensive operation that can't be optimized away, and then that will have to be executed on every iteration if the variable is declared inside the loop. The compiler can't move the variable outside the loop, because it has to be initialized in every iteration of the loop: that's what you, the programmer, specified, and the only way to achieve this is to call the constructor.

Then, it may be worth it to move the object outside the loop. (But then, of course, the assignment operator, instead of the constructor, is executed every iteration, so hopefully that is a cheaper operation.)
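A small sketch of that C++ case, using a hypothetical Tracked type that just counts how often its constructor and assignment operator run (the type and function names are assumptions for illustration, not from the question):

```cpp
#include <string>

// Hypothetical type; imagine the constructor doing something expensive.
struct Tracked {
    static int constructions;
    static int assignments;
    std::string payload;

    explicit Tracked(const std::string& text) : payload(text) {
        ++constructions;
    }

    Tracked& operator=(const std::string& text) {
        payload = text;
        ++assignments;
        return *this;
    }
};
int Tracked::constructions = 0;
int Tracked::assignments = 0;

// Declared inside the loop: constructed and destroyed on every iteration.
void DeclareInside(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        Tracked t("value");
        (void)t;  // silence unused-variable warnings
    }
}

// Declared outside the loop: constructed once, assigned on every iteration.
void DeclareOutside(int iterations) {
    Tracked t("initial");
    for (int i = 0; i < iterations; ++i) {
        t = "value";
    }
}
```

With 100 iterations, DeclareInside runs the constructor 100 times, while DeclareOutside runs it once and the assignment operator 100 times. Which one is cheaper depends entirely on what those two operations cost for the real type.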

The code generated for 1 & 2 will be identical. It doesn't matter where you declare variables. Here's the IL for an example (with Console.WriteLine("Yes") in the if statement):

.maxstack 2
.locals init (
    [0] bool x,
    [1] int32 i)
L_0000: ldc.i4.0 
L_0001: stloc.1 
L_0002: br.s L_001b
L_0004: call bool ConsoleApplication8.Program::GetSomeValue()
L_0009: stloc.0 
L_000a: ldloc.0 
L_000b: brfalse.s L_0017
L_000d: ldstr "Yes"
L_0012: call void [mscorlib]System.Console::WriteLine(string)
L_0017: ldloc.1 
L_0018: ldc.i4.1 
L_0019: add 
L_001a: stloc.1 
L_001b: ldloc.1 
L_001c: ldc.i4.s 100
L_001e: blt.s L_0004
L_0020: ret 

Stylistically, however, I'd say to declare your variables as close as possible to where you use them.

In terms of performance, more important for these examples would be whether or not the GetSomeValue() call can be moved outside the loop (i.e. whether it is invariant throughout the loop). In some cases the compiler can detect this itself.
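As a rough illustration of hoisting by hand, here is a C++ sketch. ThresholdEnabled is a hypothetical stand-in for GetSomeValue(), and it is assumed to have no side effects and to depend only on its argument; without that assumption the two versions are not equivalent.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GetSomeValue(). Assumed pure: no side effects,
// and the result depends only on the argument, so it is loop-invariant below.
static bool ThresholdEnabled(int threshold) { return threshold > 0; }

// Calls the function on every iteration, as in the question's examples.
int SumCallEveryIteration(const std::vector<int>& values, int threshold) {
    int sum = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (ThresholdEnabled(threshold)) {
            sum += values[i];
        }
    }
    return sum;
}

// Calls the function once and reuses the result inside the loop.
int SumCallHoisted(const std::vector<int>& values, int threshold) {
    const bool enabled = ThresholdEnabled(threshold);
    int sum = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (enabled) {
            sum += values[i];
        }
    }
    return sum;
}
```

If the real call is expensive and the compiler can't prove it is invariant (for example because it is defined in another translation unit), the hoisted version is the one that can actually save time.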

#3 will depend on whether or not the function Do2() is inlined into the loop. If it is, it will probably result in the exact same generated code (not MSIL) in the end.

What you do is stop trying to be smarter than your tools and just use them! Get a profiler, profile your code, and MEASURE performance!!! Then, AND ONLY THEN, will you know if it's even something you need to concern yourself with.
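If you don't have a profiler at hand, even a crude timing harness beats guessing. The sketch below is not a substitute for a real profiler. It uses std::clock from the standard library, while GetSomeValue(int), the two loop functions and TimeIt are hypothetical names invented for this example.

```cpp
#include <ctime>

// Hypothetical workload standing in for the real GetSomeValue().
static bool GetSomeValue(int i) { return (i % 3) == 0; }

// Flag declared inside the loop.
int LoopDeclaredInside(int n) {
    int hits = 0;
    for (int i = 0; i < n; ++i) {
        bool x = GetSomeValue(i);
        if (x) {
            ++hits;
        }
    }
    return hits;
}

// Flag declared before the loop.
int LoopDeclaredOutside(int n) {
    int hits = 0;
    bool x = false;
    for (int i = 0; i < n; ++i) {
        x = GetSomeValue(i);
        if (x) {
            ++hits;
        }
    }
    return hits;
}

// Runs work(n) `repeats` times, stores the last result in *result, and
// returns the elapsed CPU time in seconds as reported by std::clock.
double TimeIt(int (*work)(int), int n, int repeats, int* result) {
    const std::clock_t start = std::clock();
    int last = 0;
    for (int r = 0; r < repeats; ++r) {
        last = work(n);
    }
    const std::clock_t stop = std::clock();
    *result = last;
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Call TimeIt once per variant with the same n and repeats and compare the returned times. With a workload this small, an optimizing build may fold most of the work away, so treat the numbers as a hint and confirm on the real code.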

I know, it just sounds so darn counterintuitive to think that maybe, just maybe, you can't possibly know every damn thing your compiler may or may not optimize away. Granted, it seems to be more common for developers, experienced ones included, to think they know what they CAN'T know and then write huge mazes of unmaintainable, and unoptimizable, blobs of code, all in the name of efficiency. What they SHOULD be doing, though, is making sure the code is maintainable FIRST and THEN figuring out which bottlenecks, if any, need to be opened up.

Maintainability is usually more important, but this particular example is not a capital offense.

  • What I think is funny / typical is that you're being told it's more efficient.

If performance is seriously important, then performance should be dealt with.
Here's an example of the method I use.
