Question 1: execution time of calling the function

If most of the calls to function foo() below pass one of 10 particular values, which method will significantly reduce the execution time of calling the function?

Which choices are correct? I think the answer is D, but I am not sure. Any thoughts from the experts?

A. Replace * with an if-else block testing for the 10 values, assigning r accordingly
B. Remove inline
C. Delete foo() and move the time-consuming operation to its caller
D. Replace * with code performing a table lookup, with the 10 values and corresponding values of r
E. Replace * with a switch statement

inline int foo (int x) {
  int r;

  * // time-consuming operation on x, result stored in r

  return r;
}

B will not have any effect. The inline keyword mainly relaxes the one-definition rule, so that the same definition may appear in several translation units; it does not force the compiler to inline the function.

C is unlikely to have any impact; if the compiler determines that the function is a good candidate for inlining, it will inline it. Manually inlining it could make performance worse.

Any of the other three options (A, D, and E) may be faster or slower than the others, depending on many factors. The biggest factor is the compiler. Modern compilers are very good at optimization, and A, D, and E can all be trivially transformed into one another. Therefore, they might all end up equally fast.
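To make the comparison concrete, here is a rough sketch of what A, D, and E might look like. It is only an illustration under assumptions that are not in the question: the 10 common values are taken to be 0 through 9, and slow_op is a hypothetical stand-in for the time-consuming operation marked * (it is not actually slow here). The names foo_if_else, foo_table, foo_switch, and check_variants are made up as well.

```cpp
#include <cassert>

// Hypothetical stand-in for the time-consuming operation marked * in foo().
// Assumes small inputs so that x * x + 1 does not overflow.
int slow_op(int x) {
    return x * x + 1;
}

// Option A: if-else chain over the assumed hot values 0..9.
inline int foo_if_else(int x) {
    int r;
    if (x == 0) r = 1;
    else if (x == 1) r = 2;
    else if (x == 2) r = 5;
    else if (x == 3) r = 10;
    else if (x == 4) r = 17;
    else if (x == 5) r = 26;
    else if (x == 6) r = 37;
    else if (x == 7) r = 50;
    else if (x == 8) r = 65;
    else if (x == 9) r = 82;
    else r = slow_op(x);  // uncommon values still take the slow path
    return r;
}

// Option D: table lookup indexed directly by the assumed hot values 0..9.
inline int foo_table(int x) {
    static const int kTable[10] = {1, 2, 5, 10, 17, 26, 37, 50, 65, 82};
    if (x >= 0 && x < 10) {
        return kTable[x];  // hot path: one range check, one load
    }
    return slow_op(x);     // uncommon values still take the slow path
}

// Option E: switch over the same assumed hot values.
inline int foo_switch(int x) {
    switch (x) {
        case 0: return 1;
        case 1: return 2;
        case 2: return 5;
        case 3: return 10;
        case 4: return 17;
        case 5: return 26;
        case 6: return 37;
        case 7: return 50;
        case 8: return 65;
        case 9: return 82;
        default: return slow_op(x);  // uncommon values
    }
}

// Checks that all three variants agree with slow_op on hot and cold inputs.
void check_variants() {
    for (int x = -3; x < 20; ++x) {
        const int expected = slow_op(x);
        assert(foo_if_else(x) == expected);
        assert(foo_table(x) == expected);
        assert(foo_switch(x) == expected);
    }
}
```

Note that even the table version needs a range check before indexing, so it is not strictly free of branches; that check should be well predicted when most calls pass one of the common values. If the 10 values were not small and contiguous, the table would need some kind of search or hashing instead of direct indexing.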

The answer is therefore highly dependent on the specific compiler (and the version of that compiler) as well as the compilation flags being used. Given a specific compiler, I would need to properly benchmark each option with optimizations turned all the way up in order to determine the correct answer.
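As a rough idea of what such a benchmark could look like, here is a minimal timing sketch using std::chrono. The names BenchResult and time_calls are hypothetical, and the loop assumes the common values are 0 through 9; a serious benchmark would also need warm-up runs, repeated measurements, and a realistic mix of inputs.

```cpp
#include <chrono>

// Hypothetical result type: elapsed time plus a checksum of the results.
struct BenchResult {
    long long nanoseconds;  // wall-clock time for all calls
    long long checksum;     // sum of results, so the calls cannot be discarded
};

// Calls f(x) `iterations` times, cycling x through the assumed hot values 0..9.
template <typename F>
BenchResult time_calls(F f, int iterations) {
    volatile int input = 0;  // volatile keeps the argument opaque to the optimizer
    long long checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        input = i % 10;
        checksum += f(input);
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    return BenchResult{static_cast<long long>(ns), checksum};
}
```

One would pass each candidate implementation of foo to time_calls with the same iteration count, build with the optimization flags of interest, and compare the reported times. The checksum is there both to stop the compiler from removing the calls and to confirm that every candidate computes the same results.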

If I were taking this test, I would refuse to answer this question and send a note to the proctor/author indicating that the question is defective.


Now that I have that out of the way: if we assume all compiler optimizations are disabled, D is likely to be the fastest simply because it is branchless. A and E both involve branching, and a failed branch prediction is costly.

I would expect D to be the fastest. A and E should perform about the same.


In my tests on gcc with -O3, E is optimized to a lookup table (like D), but A remains a series of conditional jumps. So in this particular test, D and E are both correct answers.

Switching to clang with -O3, both A and E are optimized to use a lookup table (like D). It generates equivalent assembly for all options.
