
Should multiplication/bitshift optimization be visible in Java bytecode?

I keep reading that bit shifting is not needed, because compiler optimizations will translate a multiplication into a bit shift. See, for example, "Should I bit-shift to divide by 2 in Java?" and "Is shifting bits faster than multiplying and dividing in Java? .NET?".

I am not asking about the performance difference here; I could test that myself. What I find curious is that several people mention that both variants will be "compiled to the same thing", which does not seem to be true. I have written a small piece of code:

private static void multi()
{
    int a = 3;
    int b = a * 2;
    System.out.println(b);
}

private static void shift()
{
    int a = 3;
    int b = a << 1L;
    System.out.println(b);
}

Both methods compute the same result and simply print it.

When I look at the generated Java bytecode, the following is shown:

private static void multi();
Code:
   0: iconst_3
   1: istore_0
   2: iload_0
   3: iconst_2
   4: imul
   5: istore_1
   6: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
   9: iload_1
  10: invokevirtual #5                  // Method java/io/PrintStream.println:(I)V
  13: return

private static void shift();
Code:
   0: iconst_3
   1: istore_0
   2: iload_0
   3: iconst_1
   4: ishl
   5: istore_1
   6: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
   9: iload_1
  10: invokevirtual #5                  // Method java/io/PrintStream.println:(I)V
  13: return

Here we can see the difference between "imul" and "ishl".

My question: clearly, the optimization that people talk about is not visible in the Java bytecode. I still assume that the optimization does happen, so does it just happen at a lower level? Or, because this is Java, does the JVM somehow know, when it encounters the imul instruction, that it should be translated to something else? If so, any resources on how this is handled would be greatly appreciated.

(As a side note, I am not trying to justify the need for bit shifting. I think it decreases readability, at least for people used to Java; for C++ that might be different. I am just trying to see where the optimization happens.)

The question in the title sounds a bit different from the question in the text. The quoted statement that the shift and the multiplication are "compiled to the same thing" is true, but it does not yet apply at the bytecode level.

In general, Java bytecode is largely unoptimized. The Java compiler performs very few optimizations at all, mainly the inlining of constants. Beyond that, the bytecode is just an intermediate representation of the original program, and the translation from Java to bytecode is done rather "literally".

(Which, I think, is a good thing. The bytecode still closely resembles the original Java code. All the nitty-gritty, platform-specific optimizations that may be possible are left to the virtual machine, which has far more options here.)
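To illustrate the one kind of optimization that javac does perform, here is a minimal sketch of constant folding. The class and method names are made up for this illustration. An expression built only from compile-time constants is evaluated by javac, while an expression involving a parameter (or a non-final local variable such as `a` in the question) is translated literally.

```java
public class ConstantFoldingDemo {
    // A constant variable: final, primitive, initialized with a literal.
    static final int FACTOR = 2;

    // Both operands are compile-time constants, so javac evaluates
    // 3 * FACTOR itself and emits a single constant load for 6.
    static int folded() {
        return 3 * FACTOR;
    }

    // The parameter is not a constant, so javac emits the multiplication
    // literally: iload_0, iconst_2, imul.
    static int notFolded(int a) {
        return a * FACTOR;
    }

    public static void main(String[] args) {
        System.out.println(folded());      // prints 6
        System.out.println(notFolded(3));  // prints 6
    }
}
```

Compiling this and running `javap -c ConstantFoldingDemo` should show no imul instruction in folded(), but one in notFolded().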

All further optimizations, such as arithmetic optimizations, dead code elimination or method inlining, are done by the JIT (just-in-time) compiler at runtime. The JIT compiler also applies the optimization of replacing the multiplication with a bit shift.

The example that you gave makes it a bit difficult to show the effect, for several reasons. Including the System.out.println call in the method tends to make the actual machine code large, due to inlining and the general prerequisites for calling this method. More importantly, a shift by 1, which corresponds to a multiplication by 2, also corresponds to adding the value to itself. So instead of observing a shl (left-shift) assembler instruction in the resulting machine code for the multi method, you would likely see a disguised add instruction in both the multi and the shift method.

However, here is a very pragmatic example that does a left shift by 8, corresponding to a multiplication by 256:

class BitShiftOptimization
{
    public static void main(String args[])
    {
        int blackHole = 0;
        for (int i=0; i<1000000; i++)
        {
            blackHole += testMulti(i);
            blackHole += testShift(i);
        }
        System.out.println(blackHole);

    }

    public static int testMulti(int a)
    {
        int b = a * 256;
        return b;
    }

    public static int testShift(int a)
    {
        int b = a << 8L;
        return b;
    }
}

(It receives the value to be shifted as an argument, to prevent it from being optimized away into a constant. It calls the methods many times, to trigger the JIT. And it returns and collects the values from both methods, to prevent the method calls from being optimized away. Again, this is very pragmatic, but sufficient to show the effect.)

Running this in a HotSpot VM that has the disassembler available, with

java -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:+PrintInlining -XX:+PrintAssembly BitShiftOptimization

will produce the following assembler code for the testMulti method:

Decoding compiled method 0x000000000286fbd0:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} {0x000000001c0003b0} 'testMulti' '(I)I' in 'BitShiftOptimization'
  # parm0:    rdx       = int
  #           [sp+0x40]  (sp of caller)
  0x000000000286fd20: mov    %eax,-0x6000(%rsp)
  0x000000000286fd27: push   %rbp
  0x000000000286fd28: sub    $0x30,%rsp
  0x000000000286fd2c: movabs $0x1c0005a8,%rax   ;   {metadata(method data for {method} {0x000000001c0003b0} 'testMulti' '(I)I' in 'BitShiftOptimization')}
  0x000000000286fd36: mov    0xdc(%rax),%esi
  0x000000000286fd3c: add    $0x8,%esi
  0x000000000286fd3f: mov    %esi,0xdc(%rax)
  0x000000000286fd45: movabs $0x1c0003a8,%rax   ;   {metadata({method} {0x000000001c0003b0} 'testMulti' '(I)I' in 'BitShiftOptimization')}
  0x000000000286fd4f: and    $0x1ff8,%esi
  0x000000000286fd55: cmp    $0x0,%esi
  0x000000000286fd58: je     0x000000000286fd70  ;*iload_0
                        ; - BitShiftOptimization::testMulti@0 (line 17)

  0x000000000286fd5e: shl    $0x8,%edx
  0x000000000286fd61: mov    %rdx,%rax
  0x000000000286fd64: add    $0x30,%rsp
  0x000000000286fd68: pop    %rbp
  0x000000000286fd69: test   %eax,-0x273fc6f(%rip)        # 0x0000000000130100
                        ;   {poll_return}
  0x000000000286fd6f: retq   
  0x000000000286fd70: mov    %rax,0x8(%rsp)
  0x000000000286fd75: movq   $0xffffffffffffffff,(%rsp)
  0x000000000286fd7d: callq  0x000000000285f160  ; OopMap{off=98}
                        ;*synchronization entry
                        ; - BitShiftOptimization::testMulti@-1 (line 17)
                        ;   {runtime_call}
  0x000000000286fd82: jmp    0x000000000286fd5e
  0x000000000286fd84: nop
  0x000000000286fd85: nop
  0x000000000286fd86: mov    0x2a8(%r15),%rax
  0x000000000286fd8d: movabs $0x0,%r10
  0x000000000286fd97: mov    %r10,0x2a8(%r15)
  0x000000000286fd9e: movabs $0x0,%r10
  0x000000000286fda8: mov    %r10,0x2b0(%r15)
  0x000000000286fdaf: add    $0x30,%rsp
  0x000000000286fdb3: pop    %rbp
  0x000000000286fdb4: jmpq   0x0000000002859420  ;   {runtime_call}
  0x000000000286fdb9: hlt    
  0x000000000286fdba: hlt    
  0x000000000286fdbb: hlt    
  0x000000000286fdbc: hlt    
  0x000000000286fdbd: hlt    
  0x000000000286fdbe: hlt    

(The code for the testShift method has the same instructions, by the way.)

The relevant line here is

  0x000000000286fd5e: shl    $0x8,%edx

which corresponds to the left shift by 8.
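The JIT is free to make this replacement because, for Java int arithmetic with its wrap-around overflow, multiplying by a power of two and shifting left by the corresponding amount always yield the same value. The following is a small sketch that checks this for a handful of sample values, including the special case of a shift by 1, which also equals adding the value to itself. The class and method names are hypothetical and only serve this illustration; it says nothing about which instruction a particular JIT actually picks.

```java
public class ShiftEquivalenceDemo {
    // Checks that a * 2^k equals a << k for every k from 0 to 30.
    static boolean agreesForAllPowers(int a) {
        for (int k = 0; k <= 30; k++) {
            int factor = 1 << k; // 2^k, still a positive int for k <= 30
            if ((a * factor) != (a << k)) {
                return false;
            }
        }
        return true;
    }

    // Checks the special case from the question: *2, <<1 and a+a agree.
    static boolean doublingAgrees(int a) {
        return (a * 2) == (a << 1) && (a << 1) == (a + a);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 3, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int a : samples) {
            System.out.println(a + ": powers=" + agreesForAllPowers(a)
                    + ", doubling=" + doublingAgrees(a));
        }
    }
}
```

Both checks hold even for values that overflow, such as Integer.MAX_VALUE, because overflowing int multiplication and left shifts both keep only the low 32 bits of the result.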
