Branch prediction: Does avoiding the "else" branch for simple operations make code faster (Java example)?

Option 1:

  boolean isFirst = true;
  for (CardType cardType : cardTypes) {
    if (!isFirst) {
      descriptionBuilder.append(" or ");
    } else {
      isFirst = false;
    }
    //other code not relevant to this theoretical question
  }

Option 2:

boolean isFirst = true;
for (CardType cardType : cardTypes) {
  if (!isFirst) {
    descriptionBuilder.append(" or ");
  } 
  isFirst = false;
  //other code not relevant to this theoretical question
}

My analysis: both snippets have the same semantics.

First snippet) I'm not sure whether this code has two branches (as far as the branch predictor is concerned) or one. I looked through http://en.wikipedia.org/wiki/X86_instruction_listings but couldn't find an x86 instruction along the lines of "if the previous condition was false, jump there", which would avoid two branch predictions (very bad).

Second snippet) most likely always performs a simple MOV (to a register, or to a memory location that is most likely already in the cache), which is relatively inexpensive (a few cycles at most).

So my opinion is that, unless the processor can do something smart when decoding into microcode instructions, or an x86 instruction exists that avoids the branch predictions that would otherwise be necessary, the second snippet is faster.

I understand that this is a purely theoretical question, since in practice this branch might make an application 0.000000002% faster or something like that.

Did I miss something?

EDIT: I added a loop to give more "weight" to the branch in question.

EDIT2: The question is about branch prediction on Intel architecture (Pentium and newer processors).

The code has the same effect but (probably) won't produce the same bytecode or assembly.

How much difference this makes in terms of performance is unclear, and it is likely to be trivial.
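The "same effect" claim is easy to check. The sketch below wraps both options in methods and compares their output. The `CardType` enum values and the `append(cardType)` line standing in for the "other code" are assumptions made for illustration; they are not from the question.

```java
import java.util.Arrays;
import java.util.List;

class SameEffectDemo {
    // Hypothetical stand-in for the CardType used in the question.
    enum CardType { VISA, MASTERCARD, AMEX }

    // Option 1: the flag is cleared only in the else branch.
    static String option1(List<CardType> cardTypes) {
        StringBuilder descriptionBuilder = new StringBuilder();
        boolean isFirst = true;
        for (CardType cardType : cardTypes) {
            if (!isFirst) {
                descriptionBuilder.append(" or ");
            } else {
                isFirst = false;
            }
            descriptionBuilder.append(cardType); // assumed "other code"
        }
        return descriptionBuilder.toString();
    }

    // Option 2: the flag is cleared unconditionally on every iteration.
    static String option2(List<CardType> cardTypes) {
        StringBuilder descriptionBuilder = new StringBuilder();
        boolean isFirst = true;
        for (CardType cardType : cardTypes) {
            if (!isFirst) {
                descriptionBuilder.append(" or ");
            }
            isFirst = false;
            descriptionBuilder.append(cardType); // assumed "other code"
        }
        return descriptionBuilder.toString();
    }

    public static void main(String[] args) {
        List<CardType> cardTypes = Arrays.asList(CardType.values());
        System.out.println(option1(cardTypes));
        System.out.println(option2(cardTypes));
    }
}
```

Both calls should print the same joined description. This says nothing about the generated bytecode or machine code, only about the observable result.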

What is far, far more important is the clarity of the code. I have seen more bugs and performance issues caused by code being harder to reason about in simple cases like this.

In short, what is clearest and simplest to you is also likely to be fast enough, or the easiest to fix.
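If the loop does nothing beyond joining the element names with " or " (an assumption, since the question elides the rest of the loop body), one arguably clearer form drops the `isFirst` flag altogether and lets the standard library place the separators. A sketch, with a hypothetical `CardType` enum:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoinDemo {
    // Hypothetical stand-in for the CardType used in the question.
    enum CardType { VISA, MASTERCARD, AMEX }

    // Collectors.joining inserts the separator between elements only,
    // so no "is this the first element?" flag is needed.
    static String describe(List<CardType> cardTypes) {
        return cardTypes.stream()
                .map(CardType::name)
                .collect(Collectors.joining(" or "));
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList(CardType.values())));
    }
}
```

Whether this is faster is a separate matter; the point is only that there is less state to reason about.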

Using JMH gives the following numbers, with a cardTypes array of size 10 and an integer increment as the logic (Java 15 / AMD 3950X / Windows 10):

Benchmark          Mode  Cnt          Score         Error  Units
Benchmark.option1  thrpt   25  273369417.720 ± 1618952.179  ops/s
Benchmark.option2  thrpt   25  273415784.192 ±  852618.585  ops/s

On average, "Option 2" is about 0.017% faster (YMMV), a gap far smaller than the reported error margins.
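The percentage can be reproduced from the two scores in the table. The small helper below is only arithmetic on the reported numbers; the class and method names are made up for this illustration.

```java
class JmhDelta {
    // Relative throughput gain of candidate over baseline, in percent.
    static double percentFaster(double baseline, double candidate) {
        return (candidate / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Scores and error margins copied from the JMH table (ops/s).
        double option1 = 273369417.720, option1Error = 1618952.179;
        double option2 = 273415784.192, option2Error = 852618.585;

        double difference = option2 - option1;
        System.out.printf("difference: %.3f ops/s (%.3f%%)%n",
                difference, percentFaster(option1, option2));
        // A difference below either error margin is not a reliable signal.
        System.out.println("within error margins: "
                + (difference < Math.min(option1Error, option2Error)));
    }
}
```

The difference comes to roughly 46 thousand ops/s, against error margins of about 0.85 and 1.6 million ops/s, so the measurement does not separate the two options.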

See also: branch prediction, method dispatch, memory access, throughput and latency, garbage collection.

Different hardware has different costs for each assembler instruction, and on modern hardware even the cost of a single instruction is difficult to predict because of the effects of pipelining and caches.

The difference between an if and an if/else with respect to pipelining and caches is not clear from your isolated example. If you run that code once, you are unlikely to see any difference at all. Run it repeatedly in a tight loop, and the performance of the if itself will be dominated by a) the cost of the check and b) the predictability of the result of the check. In other words, branch prediction becomes the dominating factor, and that is not affected by whether the code is an if or an if/else block.

An excellent discussion of the effects of branch prediction can be found in "Why is it faster to process a sorted array than an unsorted array?" (see the top-scoring answer).
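To see predictability rather than if versus if/else at work, a crude timing sketch along the lines of that discussion might look like the following. It is not a rigorous benchmark (JMH would be the better tool), the array size, threshold, and round count are arbitrary choices, and the JIT may compile the branch into a conditional move, in which case the two timings can come out nearly equal.

```java
import java.util.Arrays;
import java.util.Random;

class BranchPredictabilityDemo {
    // Sums the values at or above the threshold; the if is the branch under test.
    static long sumAbove(int[] data, int threshold) {
        long sum = 0;
        for (int value : data) {
            if (value >= threshold) {
                sum += value;
            }
        }
        return sum;
    }

    // Runs sumAbove repeatedly and returns the elapsed time in nanoseconds.
    static long timeNanos(int[] data, int rounds) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += sumAbove(data, 128);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == -1) System.out.println(sink); // keeps the result "used"
        return elapsed;
    }

    public static void main(String[] args) {
        int[] unsorted = new Random(42).ints(32_768, 0, 256).toArray();
        int[] sorted = unsorted.clone();
        Arrays.sort(sorted); // same values, but the branch outcome is now predictable

        timeNanos(unsorted, 200); // warm-up so the JIT compiles the loop
        timeNanos(sorted, 200);

        System.out.println("unsorted: " + timeNanos(unsorted, 1_000) / 1_000_000 + " ms");
        System.out.println("sorted:   " + timeNanos(sorted, 1_000) / 1_000_000 + " ms");
    }
}
```

The same check runs on the same values in both cases; only the order, and therefore the predictability of the branch, changes.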

I am assuming that your code snippet is an if block inside a for loop. HotSpot can unroll for loops, and this includes taking the common "is this the first iteration of the loop" check and moving it outside the loop. That avoids the cost of rechecking the condition on every iteration, and with it the question of which is faster, if or if/else.

Oracle documents this behavior here.
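Written by hand, the effect of moving the first-iteration check out of the loop might look roughly like this. It is an illustration of the idea only, not the code HotSpot actually generates, and the method and parameter names are invented for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class PeeledLoopDemo {
    // The first iteration is handled before the loop, so the loop body
    // no longer needs an isFirst flag or a check on it.
    static String describe(List<String> names) {
        StringBuilder descriptionBuilder = new StringBuilder();
        Iterator<String> it = names.iterator();
        if (it.hasNext()) {
            descriptionBuilder.append(it.next()); // first element, no separator
        }
        while (it.hasNext()) {
            descriptionBuilder.append(" or ").append(it.next());
        }
        return descriptionBuilder.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList("VISA", "MASTERCARD", "AMEX")));
    }
}
```

If the JIT performs an equivalent transformation on the original loop, the choice between Option 1 and Option 2 no longer affects the steady-state loop body.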

"Both code has same semantic."

No, the two snippets are different.

The first snippet executes isFirst = false; only when the condition if (!isFirst) does not match.

The second snippet sets the flag to false on every iteration, whether or not the condition is satisfied.

There are two branches in an if/else construct: the conditional branch at the top, and the branch around the else part at the end of the if part. There are no branches in the else part, at least not in any even moderately competently implemented compiler.

Against that you have to balance the cost of always executing the isFirst = false; line.

In the specific case you mention, it isn't likely to make the slightest difference compared to the cost of the method call.
