Why does one code sample take more time to execute than the other?
Sample 1
for(int i = 0 ; i <= 99 ; i++)
printf("Hello world");
Sample 2
printf("Hello world"); // 1st print
printf("Hello world"); // 2nd print
.
.
.
printf("Hello world"); // 100th print
I know that sample 1 takes more time to execute than sample 2, and that sample 2 takes more memory in the text segment.
But I want to know what is going on behind the scenes.
Imagine sample 1 written out as this sequence of operations:
i = 0
if (i <= 99)
print
i++
jump
if (i <= 99)
print
i++
jump
if (i <= 99)
print
i++
jump
...
While the second sample is simply:
print
print
print
print
...
This is extremely simplified, but you should get the idea: the first sample executes many more instructions to get through the loop.
As a side note, this is one of the optimizations a compiler will frequently do: it unrolls the loop and compiles it as if there were no loop. To do that, it has to conclude that unrolling is worthwhile. Note that sample 2 compiles into a much greater total number of instructions and takes much more space in memory (and therefore takes longer to load).
The code in sample 2 can be quicker if programmed properly.
As you have described it, there are 100 calls to printf("...") with the same string as the parameter. An optimizing compiler can detect that you are passing exactly the same parameter and avoid popping the pointer after each call, so it does not need to push it again for the next call.
Also, the difference in speed between the two samples is the time spent jumping back to the beginning of the loop. On present architectures the loop can even be an advantage, since the whole loop code is cached by the CPU (which cannot be done with a large set of similar calls) and no memory access is needed to load the instructions, which compensates for the time spent executing the loop instructions.
But even then, a good optimizing compiler can detect that you have written the same statement 100 times and fold them into a loop with a hidden control variable (as in sample 1), so you will not see a difference in execution time.
Optimizing compilers are designed to detect this kind of construction and change the code to be more efficient.
A good reference for this kind of material is: http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools