Loop performance in Java (with vs. without bit shift, for vs. while)

Question

I just ran some tests with loops in Java. I assumed that a bit shift in Java is generally faster than the default integer increment. Here is my sample code:

final int n = 16;
long n1 = System.nanoTime();
for (int i = 1; i < 1 << n; i <<= 1) {
    // nothing
}
long n2 = System.nanoTime();
for (int i = 0; i < n; i++) {
    // nothing
}
long n3 = System.nanoTime();
System.out.println("with shift = " + (n2 - n1) + " ns");
System.out.println("without shift = " + (n3 - n2) + " ns");

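So my expectation was that the time between n1 and n2 would be smaller than the time between n2 and n3. But every time I run this snippet, the integer increment appears to be faster. Here is the output of the code above: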

with shift = 2445 ns
without shift = 1885 ns

with shift = 2374 ns
without shift = 1886 ns

with shift = 2374 ns
without shift = 1607 ns

Can someone explain this behaviour? Does the answer lie in how the JVM compiles this code, or in the underlying architecture?

Ubuntu Linux 3.5.0-17-generic i686 GNU/Linux
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Pentium(R) Dual-Core CPU       T4300  @ 2.10GHz
stepping    : 10
microcode   : 0xa07
cpu MHz     : 1200.000
cache size  : 1024 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fdiv_bug    : no
hlt_bug     : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm dtherm
bogomips    : 4189.42
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Pentium(R) Dual-Core CPU       T4300  @ 2.10GHz
stepping    : 10
microcode   : 0xa07
cpu MHz     : 1200.000
cache size  : 1024 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fdiv_bug    : no
hlt_bug     : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm dtherm
bogomips    : 4189.42
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

========== Edit ===============

OK, so I updated my code to get better measurements.

My JVM:

java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) Server VM (build 20.12-b01, mixed mode)

New code:

// amount of shifts
final int n = 16;
// recorded times
long n1 = 0, n2 = 0, n3 = 0, n4 = 0, n5 = 0;
// measured times
long withShiftFor = Long.MAX_VALUE;
long withoutShiftFor = Long.MAX_VALUE;
long withShiftWhile = Long.MAX_VALUE;
long withoutShiftWhile = Long.MAX_VALUE;
// instance to operate with
boolean b = true;
// do some loops to measure a better result
for (int x = 0; x < 2000000; x++) {
    // for loop with shift
    n1 = System.nanoTime();
    for (int i = 1; i < 1 << n; i <<= 1) {
        b = !b;
    }
    // for loop without shift
    n2 = System.nanoTime();
    for (int i = 0; i < n; i++) {
        b = !b;
    }
    // while loop with shift
    n3 = System.nanoTime();
    int i = 1;
    while (i < 1 << n) {
        b = !b;
        i <<= 1;
    }
    // while loop without shift
    n4 = System.nanoTime();
    int j = 0;
    while (j < n) {
        b = !b;
        j++;
    }
    n5 = System.nanoTime();
    // take minimal time to save best result
    withShiftFor = Math.min(withShiftFor, n2 - n1);
    withoutShiftFor = Math.min(withoutShiftFor, n3 - n2);
    withShiftWhile = Math.min(withShiftWhile, n4 - n3);
    withoutShiftWhile = Math.min(withoutShiftWhile, n5 - n4);
}
System.out.println("for with shift = " + withShiftFor + " ns");
System.out.println("for without shift = " + withoutShiftFor + " ns");
System.out.println("while with shift = " + withShiftWhile + " ns");
System.out.println("while without shift = " + withoutShiftWhile + " ns");

New output from 3 runs (each run took more than 5 seconds):

for with shift = 907 ns
for without shift = 838 ns
while with shift = 907 ns
while without shift = 907 ns

for with shift = 907 ns
for without shift = 907 ns
while with shift = 907 ns
while without shift = 907 ns

for with shift = 907 ns
for without shift = 838 ns
while with shift = 907 ns
while without shift = 907 ns

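So you were right: after a few seconds and a lot of iterations, the results are almost the same. But why is the for loop with the shift not faster than the other solutions? Does the JVM not optimise this, given that the increment takes one bytecode while the shift takes four, as you mentioned? And why is the while loop with the increment just as fast as the other loops?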

Answer 1

Can someone explain this behaviour? Does the answer lie in how the JVM compiles this code, or in the underlying architecture?

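When you run a short loop, the code is interpreted. So if you don't intend to run the code very often, or you can't warm it up, then that is the case you should benchmark, and you should expect odd results like the ones you got.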

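If you want to compare compiled/optimised code, you should ignore the first 10K to 20K loop iterations, because by default a loop has to iterate about 10K times to trigger compilation (and the compilation then takes a little while in the background).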
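The warm-up advice could be sketched roughly as follows. This is only an illustration, not the asker's code: the class name `WarmupSketch`, the helper methods and the iteration counts are assumptions chosen for the example, and each loop returns a count so that its result is actually used.

```java
// Hypothetical sketch: warm up first, then measure. All names here are illustrative.
final class WarmupSketch {
    // Loop driven by a left shift; returns the number of iterations performed.
    static int shiftLoop(int n) {
        int count = 0;
        for (int i = 1; i < 1 << n; i <<= 1) {
            count++;
        }
        return count;
    }

    // Loop driven by a plain increment; returns the number of iterations performed.
    static int incrementLoop(int n) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        final int n = 16;
        final int warmupRuns = 20000;     // assumption: enough calls to trigger compilation
        final int measuredRuns = 1000000; // assumption: arbitrary, large enough to average out noise
        long sink = 0;                    // accumulates results so the work is not trivially dead

        // Warm-up phase: these runs are deliberately not timed.
        for (int x = 0; x < warmupRuns; x++) {
            sink += shiftLoop(n) + incrementLoop(n);
        }

        // Measured phase: time many calls at once instead of a single tiny loop.
        long t0 = System.nanoTime();
        for (int x = 0; x < measuredRuns; x++) {
            sink += shiftLoop(n);
        }
        long t1 = System.nanoTime();
        for (int x = 0; x < measuredRuns; x++) {
            sink += incrementLoop(n);
        }
        long t2 = System.nanoTime();

        System.out.println("with shift    = " + (t1 - t0) / (double) measuredRuns + " ns per call");
        System.out.println("without shift = " + (t2 - t1) / (double) measuredRuns + " ns per call");
        System.out.println("sink = " + sink); // printing the sink keeps the results observable
    }
}
```

Even with a warm-up, the JIT may still simplify loops this small, so the numbers are indicative at best.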

In any case, I also suggest running the test for at least 2 seconds to reduce variation.

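Your loops don't do anything, so I would expect the JIT to eliminate them; you end up timing only how long System.nanoTime() takes, which can add 40 to 1000 ns depending on your system.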
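One way to get a feel for the timer overhead on a given machine is to time two back-to-back System.nanoTime() calls with nothing in between. The sketch below assumes a hypothetical class name `NanoTimeOverhead` and an arbitrary sample count; it reports the smallest difference observed, the same "take the minimum" approach used in the updated question code.

```java
// Hypothetical sketch: estimate the cost and granularity of System.nanoTime() itself.
final class NanoTimeOverhead {
    // Returns the smallest difference seen between two consecutive nanoTime() calls.
    static long minBackToBackDelta(int samples) {
        long min = Long.MAX_VALUE;
        for (int s = 0; s < samples; s++) {
            long a = System.nanoTime();
            long b = System.nanoTime(); // nothing is measured between the two calls
            min = Math.min(min, b - a);
        }
        return min;
    }

    public static void main(String[] args) {
        // Assumption: 1,000,000 samples is simply a convenient number, not a tuned value.
        System.out.println("min nanoTime delta = " + minBackToBackDelta(1000000) + " ns");
    }
}
```

If the loops under test report times close to this value, the measurement is probably dominated by the timer rather than by the loop body.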

Answer 2

Shifting the number takes 4 bytecodes, while incrementing takes only 1. The JIT compiler may change that, as Peter Lawrey says.
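The bytecode difference can be inspected by compiling a small class and disassembling it with `javap -c`. The class below is a hypothetical example (the name `LoopBytecode` and its methods are not from the question); the comments show what javac typically emits for the two update expressions, although the exact slot numbers depend on the method.

```java
// Hypothetical sketch: compile, then run `javap -c LoopBytecode` to view the bytecode.
final class LoopBytecode {
    static int shiftSteps(int n) {
        int steps = 0;
        // The update `i <<= 1` on a local int is typically compiled to four instructions:
        //   iload_2, iconst_1, ishl, istore_2
        for (int i = 1; i < 1 << n; i <<= 1) {
            steps++;
        }
        return steps;
    }

    static int incrementSteps(int n) {
        int steps = 0;
        // The update `i++` on a local int is typically compiled to a single instruction:
        //   iinc 2, 1
        for (int i = 0; i < n; i++) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("shift steps     = " + shiftSteps(16));
        System.out.println("increment steps = " + incrementSteps(16));
    }
}
```

The bytecode count only matters while the code is interpreted. Once the JIT compiles the method, both updates can become a single machine instruction on x86, which would be consistent with the near-identical timings in the updated measurements.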

Solution 1
2 Accepted 2012-11-29 11:28:20

Solution 2
1 2012-11-29 11:42:32
