
Does the Java compiler optimize an unnecessary ternary operator?

I've been reviewing code where some coders have been using redundant ternary operators “for readability”, such as:

boolean val = (foo == bar && foo1 != bar) ? true : false;

Obviously it would be better to just assign the statement's result to the boolean variable, but does the compiler care?

I find that unnecessary use of the ternary operator tends to make code more confusing and less readable, contrary to the original intention.

That being said, the compiler's behaviour in this regard can easily be tested by comparing the compiled bytecode.
Here are two mock classes to illustrate this:

Case I (without the ternary operator):

class Class {

    public static void foo(int a, int b, int c) {
        boolean val = (a == c && b != c);
        System.out.println(val);
    }

    public static void main(String[] args) {
       foo(1,2,3);
    }
}

Case II (with the ternary operator):

class Class {

    public static void foo(int a, int b, int c) {
        boolean val = (a == c && b != c) ? true : false;
        System.out.println(val);
    }

    public static void main(String[] args) {
       foo(1,2,3);
    }
}

Bytecode for foo() method in Case I:

       0: iload_0
       1: iload_2
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpeq     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: istore_3
      16: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
      19: iload_3
      20: invokevirtual #3                  // Method java/io/PrintStream.println:(Z)V
      23: return

Bytecode for foo() method in Case II:

       0: iload_0
       1: iload_2
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpeq     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: istore_3
      16: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
      19: iload_3
      20: invokevirtual #3                  // Method java/io/PrintStream.println:(Z)V
      23: return

Note that in both cases the bytecode is identical, i.e. the compiler disregards the ternary operator when compiling the value of the val boolean.
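Identical bytecode means identical behaviour, and that part can be checked without a disassembler. The following is a minimal sketch that runs both forms over a small grid of inputs and confirms they agree. The class and method names (`TernaryCheck`, `plain`, `ternary`, `agreeOnGrid`) are made up for illustration and are not part of the mock classes above.

```java
public class TernaryCheck {

    // Case I: use the boolean expression directly.
    static boolean plain(int a, int b, int c) {
        return a == c && b != c;
    }

    // Case II: the same expression wrapped in a redundant ternary.
    static boolean ternary(int a, int b, int c) {
        return (a == c && b != c) ? true : false;
    }

    // Returns true if both forms agree for every (a, b, c) with each value in [-limit, limit].
    static boolean agreeOnGrid(int limit) {
        for (int a = -limit; a <= limit; a++) {
            for (int b = -limit; b <= limit; b++) {
                for (int c = -limit; c <= limit; c++) {
                    if (plain(a, b, c) != ternary(a, b, c)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("forms agree: " + agreeOnGrid(3));
    }
}
```

This only shows that the two forms compute the same value; it says nothing about which instructions a particular compiler emits, which is what the bytecode listings are for.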


EDIT:

The conversation regarding this question has gone in one of several directions.
As shown above, in both cases (with or without the redundant ternary) the compiled Java bytecode is identical.
Whether this can be regarded as an optimization by the Java compiler depends somewhat on your definition of optimization. In some respects, as pointed out multiple times in other answers, it makes sense to argue that no, it isn't an optimization so much as the fact that in both cases the generated bytecode is the simplest set of stack operations that performs this task, regardless of the ternary.

However, regarding the main question:

Obviously it would be better to just assign the statement's result to the boolean variable, but does the compiler care?

The simple answer is no. The compiler doesn't care.

Contrary to the answers of Pavel Horal, Codo and yuvgin, I argue that the compiler does NOT optimize away (or disregard) the ternary operator. (Clarification: I refer to the Java-to-bytecode compiler, not the JIT.)

See the test cases.

Class 1: Evaluate a boolean expression, store it in a variable, and return that variable.

public static boolean testCompiler(final int a, final int b)
{
    final boolean c = ...;
    return c;
}

So, for different boolean expressions we inspect the bytecode:

1. Expression: a == b

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_1
   6: goto          10
   9: iconst_0
  10: istore_2
  11: iload_2
  12: ireturn

2. Expression: a == b ? true : false

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_1
   6: goto          10
   9: iconst_0
  10: istore_2
  11: iload_2
  12: ireturn

3. Expression: a == b ? false : true

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_0
   6: goto          10
   9: iconst_1
  10: istore_2
  11: iload_2
  12: ireturn

Cases (1) and (2) compile to exactly the same bytecode, not because the compiler optimizes away the ternary operator, but because it essentially needs to execute that trivial ternary operator every time. It needs to specify at bytecode level whether to return true or false. To verify that, look at case (3). It is exactly the same bytecode except for lines 5 and 9, which are swapped.

Why, then, does a == b ? true : false produce a == b when decompiled? That is the decompiler's choice: it selects the simplest path.

Furthermore, based on the "Class 1" experiment, it is reasonable to assume that a == b ? true : false is exactly the same as a == b in the way it is translated to bytecode. However, this is not true. To test that, we examine the following "Class 2", the only difference from "Class 1" being that it doesn't store the boolean result in a variable but instead returns it immediately.

Class 2: Evaluate a boolean expression and return the result (without storing it in a variable).

public static boolean testCompiler(final int a, final int b)
{
    return ...;
}

1. a == b

Bytecode:

   0: iload_0
   1: iload_1
   2: if_icmpne     7
   5: iconst_1
   6: ireturn
   7: iconst_0
   8: ireturn

2. a == b ? true : false

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_1
   6: goto          10
   9: iconst_0
  10: ireturn

3. a == b ? false : true

Bytecode

   0: iload_0
   1: iload_1
   2: if_icmpne     9
   5: iconst_0
   6: goto          10
   9: iconst_1
  10: ireturn

Here it is obvious that the a == b and a == b ? true : false expressions are compiled differently, as cases (1) and (2) produce different bytecodes (cases (2) and (3), as expected, have only their lines 5 and 9 swapped).

At first I found this surprising, as I was expecting all 3 cases to be the same (excluding the swapped lines 5 and 9 of case (3)). When the compiler encounters a == b, it evaluates the expression and returns immediately, in contrast to a == b ? true : false, where it uses the goto to jump to the ireturn line. I understand that this is done to leave space for potential statements to be evaluated inside the 'true' case of the ternary operator, between the if_icmpne check and the goto line. Even if in this case it is just a boolean true, the compiler handles it as it would in the general case where a more complex block would be present.
On the other hand, the "Class 1" experiment obscured that fact, because there the true branch is followed by istore and iload and not only by ireturn, which forces a goto and results in exactly the same bytecode in cases (1) and (2).
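The two experiments can be reproduced from a single file. The sketch below substitutes each of the three listed expressions for the ... placeholder, once in the store-then-return shape ("Class 1") and once in the return-immediately shape ("Class 2"). The class name `TernaryShapes` and the method names are hypothetical. Compiling it and running javap -c TernaryShapes lists all six methods; whether the listings match the ones above depends on the compiler and its version.

```java
public class TernaryShapes {

    // "Class 1" shape: store the result in a local variable, then return it.
    static boolean storedPlain(final int a, final int b) {
        final boolean c = a == b;
        return c;
    }

    static boolean storedTernary(final int a, final int b) {
        final boolean c = a == b ? true : false;
        return c;
    }

    static boolean storedInverted(final int a, final int b) {
        final boolean c = a == b ? false : true;
        return c;
    }

    // "Class 2" shape: return the expression without storing it.
    static boolean directPlain(final int a, final int b) {
        return a == b;
    }

    static boolean directTernary(final int a, final int b) {
        return a == b ? true : false;
    }

    static boolean directInverted(final int a, final int b) {
        return a == b ? false : true;
    }

    public static void main(String[] args) {
        // Whatever the bytecode looks like, the results must agree.
        System.out.println(storedPlain(1, 1) == directTernary(1, 1));
        System.out.println(storedInverted(1, 2) == directInverted(1, 2));
    }
}
```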

As a note regarding the test environment, these bytecodes were produced with the latest Eclipse (4.10), which uses its own ECJ compiler, different from the javac that IntelliJ IDEA uses.

However, reading the javac-produced bytecode in the other answers (which use IntelliJ), I believe the same logic applies there too, at least for the "Class 1" experiment where the value was stored and not returned immediately.

Finally, as already pointed out in other answers (such as those by supercat and jcsahnwaldt), both in this thread and in other questions on SO, the heavy optimizing is done by the JIT compiler and not by the Java-to-bytecode compiler, so these inspections, while informative about the bytecode translation, are not a good measure of how the final optimized code will execute.

Complement: jcsahnwaldt's answer compares the bytecode produced by javac and ECJ for similar cases.

(As a disclaimer, I have not studied Java compilation or disassembly enough to actually know what happens under the hood; my conclusions are mainly based on the results of the above experiments.)

Yes, the Java compiler does optimize. It can be easily verified:

public class Main1 {
  public static boolean test(int foo, int bar, int baz) {
    return foo == bar && bar == baz ? true : false;
  }
}

After javac Main1.java and javap -c Main1:

  public static boolean test(int, int, int);
    Code:
       0: iload_0
       1: iload_1
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpne     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: ireturn

public class Main2 {
  public static boolean test(int foo, int bar, int baz) {
    return foo == bar && bar == baz;
  }
}

After javac Main2.java and javap -c Main2:

  public static boolean test(int, int, int);
    Code:
       0: iload_0
       1: iload_1
       2: if_icmpne     14
       5: iload_1
       6: iload_2
       7: if_icmpne     14
      10: iconst_1
      11: goto          15
      14: iconst_0
      15: ireturn

Both examples end up with exactly the same bytecode.
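The same comparison can be scripted instead of done by eye. The following is a minimal sketch, assuming a JDK 11 or newer, since it relies on the javac and javap tools being reachable through java.util.spi.ToolProvider. `BytecodeDiff`, `instructions` and `sameBytecode` are hypothetical names. Only the numbered instruction lines are compared, because the javap headers contain the class name and would always differ.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.spi.ToolProvider;
import java.util.stream.Collectors;

public class BytecodeDiff {

    // Compiles a class whose test() method returns the given expression and
    // returns the instruction lines of its "javap -c" listing.
    static String instructions(Path dir, String className, String expression) {
        ToolProvider javac = ToolProvider.findFirst("javac")
                .orElseThrow(() -> new IllegalStateException("javac not found (a JDK is required)"));
        ToolProvider javap = ToolProvider.findFirst("javap")
                .orElseThrow(() -> new IllegalStateException("javap not found (a JDK is required)"));
        try {
            Path source = dir.resolve(className + ".java");
            Files.writeString(source,
                    "public class " + className + " {\n"
                    + "  public static boolean test(int foo, int bar, int baz) {\n"
                    + "    return " + expression + ";\n"
                    + "  }\n"
                    + "}\n");

            StringWriter log = new StringWriter();
            PrintWriter logWriter = new PrintWriter(log);
            if (javac.run(logWriter, logWriter, "-d", dir.toString(), source.toString()) != 0) {
                throw new IllegalStateException("javac failed: " + log);
            }

            StringWriter listing = new StringWriter();
            PrintWriter listingWriter = new PrintWriter(listing);
            String classFile = dir.resolve(className + ".class").toString();
            if (javap.run(listingWriter, logWriter, "-c", classFile) != 0) {
                throw new IllegalStateException("javap failed: " + log);
            }
            listingWriter.flush();

            // Keep only lines such as "       2: if_icmpne     14".
            return listing.toString().lines()
                    .filter(line -> line.matches("\\s*\\d+: .*"))
                    .collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if both expressions compile to the same instruction listing.
    static boolean sameBytecode(String first, String second) {
        try {
            Path dir = Files.createTempDirectory("ternary");
            return instructions(dir, "Main1", first).equals(instructions(dir, "Main2", second));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameBytecode(
                "foo == bar && bar == baz ? true : false",
                "foo == bar && bar == baz"));
    }
}
```

The result reflects whichever javac ships with the running JDK, so it may differ from the listings above on other compilers or versions.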

The javac compiler does not generally attempt to optimize code before outputting bytecode. Instead, it relies upon the Java virtual machine (JVM) and the just-in-time (JIT) compiler, which converts the bytecode to machine code, to optimize situations where a construct would be equivalent to a simpler one.

This makes it much easier to determine if an implementation of a Java compiler is working correctly, since most constructs can only be represented by one predefined sequence of bytecodes. If a compiler produces any other bytecode sequence, it is broken, even if that sequence would behave in the same fashion as the original.

Examining the bytecode output of the javac compiler is not a good way of judging whether a construct is likely to execute efficiently or inefficiently. It seems likely that there are some JVM implementations where constructs like (someCondition ? true : false) would perform worse than (someCondition), and some where they would perform identically.
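To look past the bytecode, both forms have to run often enough for the JIT compiler to pick them up. The sketch below does only that: it calls a plain and a ternary variant in a hot loop and reports how often they disagree. `JitWarmup` and its methods are hypothetical names. On HotSpot, running it with java -XX:+PrintCompilation JitWarmup prints the methods as they get compiled. This is not a benchmark, and any timing comparison would need a proper harness such as JMH.

```java
public class JitWarmup {

    // The bare condition.
    static boolean plain(int a, int b) {
        return a == b;
    }

    // The same condition behind a redundant ternary.
    static boolean ternary(int a, int b) {
        return a == b ? true : false;
    }

    // Calls both forms repeatedly so a JIT compiler is likely to compile them,
    // and counts how many times the two results differ.
    static int disagreements(int iterations) {
        int count = 0;
        for (int i = 0; i < iterations; i++) {
            int a = i % 7;
            int b = i % 5;
            if (plain(a, b) != ternary(a, b)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("disagreements: " + disagreements(1_000_000));
    }
}
```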

In IntelliJ, I've compiled your code and then opened the class file, which is automatically decompiled. The result is:

boolean val = foo == bar && foo1 != bar;

So yes, the Java compiler optimizes it.

I'd like to synthesize the excellent information given in the previous answers.

Let's look at what Oracle's javac and Eclipse's ecj do with the following code:

boolean  valReturn(int a, int b) { return a == b; }
boolean condReturn(int a, int b) { return a == b ? true : false; }
boolean   ifReturn(int a, int b) { if (a == b) return true; else return false; }

void  valVar(int a, int b) { boolean c = a == b; }
void condVar(int a, int b) { boolean c = a == b ? true : false; }
void   ifVar(int a, int b) { boolean c; if (a == b) c = true; else c = false; }

(I simplified your code a bit, using one comparison instead of two, but the behavior of the compilers described below is essentially the same, including their slightly different results.)

I compiled the code with javac and ecj and then disassembled it with Oracle's javap.

Here's the result for javac (I tried javac 9.0.4 and 11.0.2; they generate exactly the same code):

boolean valReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: ireturn

boolean condReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: ireturn

boolean ifReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     7
     5: iconst_1
     6: ireturn
     7: iconst_0
     8: ireturn

void valVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void condVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void ifVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     10
     5: iconst_1
     6: istore_3
     7: goto          12
    10: iconst_0
    11: istore_3
    12: return

And here's the result for ecj (version 3.16.0):

boolean valReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     7
     5: iconst_1
     6: ireturn
     7: iconst_0
     8: ireturn

boolean condReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: ireturn

boolean ifReturn(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     7
     5: iconst_1
     6: ireturn
     7: iconst_0
     8: ireturn

void valVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void condVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     9
     5: iconst_1
     6: goto          10
     9: iconst_0
    10: istore_3
    11: return

void ifVar(int, int);
  Code:
     0: iload_1
     1: iload_2
     2: if_icmpne     10
     5: iconst_1
     6: istore_3
     7: goto          12
    10: iconst_0
    11: istore_3
    12: return

For five of the six functions, both compilers generate exactly the same code. The only difference is in valReturn: javac generates a goto to an ireturn, but ecj generates an ireturn. For condReturn, they both generate a goto to an ireturn. For ifReturn, they both generate an ireturn.

Does that mean that one of the compilers optimizes one or more of these cases? One might think that javac optimizes the ifReturn code but fails to optimize valReturn and condReturn, while ecj optimizes ifReturn and valReturn but fails to optimize condReturn.

But I don't think that's true. Java source code compilers basically don't optimize code at all. The compiler that does optimize the code is the JIT (just-in-time) compiler (the part of the JVM that compiles byte code to machine code), and the JIT compiler can do a better job if the byte code is relatively simple, i.e. has not been optimized.

In a nutshell: No, Java source code compilers do not optimize this case, because they don't really optimize anything. They do what the specifications require them to do, but nothing more. The javac and ecj developers simply chose slightly different code generation strategies for these cases (presumably for more or less arbitrary reasons).

See these Stack Overflow questions for a few more details.

(Case in point: both compilers nowadays ignore the -O flag. The ecj options explicitly say so: -O: optimize for execution time (ignored). javac doesn't even mention the flag anymore and just ignores it.)
