
static final field, static field and performance

Even though it's not its main purpose, I've always thought that the final keyword (in some situations and VM implementations) could help the JIT.
It might be an urban legend, but I never imagined that making a field final could negatively affect performance.

Until I ran into some code like this:

   private static final int THRESHOLD = 10_000_000;
   private static int [] myArray = new int [THRESHOLD];

   public static void main(String... args) {
      final long begin = System.currentTimeMillis();

      //Playing with myArray
      int index1,index2;
      for(index1 = THRESHOLD - 1; index1 > 1; index1--)
          myArray[index1] = 42;             //Array initial data
      for(index1 = THRESHOLD - 1; index1 > 1; index1--) {
                                            //Filling the array
          for(index2 = index1 << 1; index2 < THRESHOLD; index2 += index1)
              myArray[index2] += 32;
      }

      long result = 0;
      for(index1 = THRESHOLD - 1; index1 > 1; index1-=100)
          result += myArray[index1];

      //Stop playing, let's see how long it took
      System.out.println(result);
      System.out.println((System.currentTimeMillis())-begin+"ms");
   }


Let's have a look at: private static int [] myArray = new int [THRESHOLD];
Under Windows 7 64-bit, averaged over 10 successive runs, I get the following results:

  1. THRESHOLD = 10^7, 1.7.0u09 client VM (Oracle):

    • runs in ~2133ms when myArray is not final.
    • runs in ~2287ms when myArray is final.
    • The -server VM produces similar figures, i.e. 2131ms and 2284ms.

  2. THRESHOLD = 3x10^7, 1.7.0u09 client VM (Oracle):

    • runs in ~7647ms when myArray is not final.
    • runs in ~8190ms when myArray is final.
    • The -server VM produces ~7653ms and ~8150ms.

  3. THRESHOLD = 3x10^7, 1.7.0u01 client VM (Oracle):

    • runs in ~8166ms when myArray is not final.
    • runs in ~9694ms when myArray is final. That's more than 15% difference!
    • The -server VM produces a negligible difference in favour of the non-final version, about 1%.

Remark: I used the bytecode produced by JDK 1.7.0u09's javac for all my tests. As expected, the bytecode is exactly the same for both versions except for the myArray declaration.

So why is the version with a static final myArray slower than the one with a static myArray?


EDIT (using Aubin's version of my snippet):

It appears that the difference between the version with the final keyword and the one without lies only in the first iteration. Somehow, the version with final is always slower than its counterpart on the first iteration; subsequent iterations have similar timings.

For example, with THRESHOLD = 10^8 and running with the 1.7.0u09 client VM, the first computation takes approx. 35s while the second 'only' takes 30s.

Obviously the VM performed an optimization. Was that the JIT in action, and why didn't it kick in earlier (for example by compiling the inner level of the nested loop, since that part was the hotspot)?

Note that my remarks are still valid with the 1.7.0u01 client VM. With that very version (and maybe earlier releases), the code with final myArray runs slower than the one without this keyword: 2671ms vs 2331ms over 200 iterations.
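To look at the first-iteration effect in isolation, each pass can be timed separately instead of being folded into one total. Below is a minimal sketch of that idea, not the code behind the figures above: the class name FirstIterationTiming and the methods compute and timePasses are hypothetical, the array is deliberately much smaller than in the tests above so that it runs quickly, and System.nanoTime is used instead of currentTimeMillis.

```java
public class FirstIterationTiming {
    // Assumption: far smaller than the sizes measured above, so the sketch runs quickly.
    private static final int THRESHOLD = 1_000_000;
    // Remove 'final' here to compare the two variants.
    private static final int[] myArray = new int[THRESHOLD];

    // Same three loops as in the question; returns the checksum.
    static long compute() {
        for (int i = THRESHOLD - 1; i > 1; i--) {
            myArray[i] = 42;
        }
        for (int i = THRESHOLD - 1; i > 1; i--) {
            for (int j = i << 1; j < THRESHOLD; j += i) {
                myArray[j] += 32;
            }
        }
        long result = 0;
        for (int i = THRESHOLD - 1; i > 1; i -= 100) {
            result += myArray[i];
        }
        return result;
    }

    // Runs compute() 'passes' times and returns the elapsed nanoseconds of each pass.
    static long[] timePasses(int passes) {
        long[] elapsed = new long[passes];
        for (int p = 0; p < passes; p++) {
            long start = System.nanoTime();
            long result = compute();
            elapsed[p] = System.nanoTime() - start;
            // Use the result so the computation cannot be discarded as dead code.
            if (result <= 0) {
                throw new AssertionError("unexpected checksum: " + result);
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timePasses(5);
        for (int p = 0; p < elapsed.length; p++) {
            // Printing happens after all timing is done, so I/O is not measured.
            System.out.println("pass " + p + ": " + (elapsed[p] / 1_000_000) + " ms");
        }
    }
}
```

If the first pass is consistently the slow one in both variants, the gap is more likely a warm-up effect than a steady-state cost of final.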

IMHO, the time taken by System.out.println( result ) should not be counted, because I/O is highly variable and time-consuming.

I think the influence of println() is bigger, much bigger, than the influence of final.

I propose writing the performance test as follows:

public class Perf {
   private static final int   THRESHOLD = 10_000_000;
   private static final int[] myArray   = new int[THRESHOLD];
   private static /* */ long  min = Long.MAX_VALUE;
   private static /* */ long  max = 0L;
   private static /* */ long  sum = 0L;

   private static void perf( int iteration ) {
      final long begin = System.currentTimeMillis();

      int index1, index2;
      for( index1 = THRESHOLD - 1; index1 > 1; index1-- ) {
         myArray[ index1 ] = 42;
      }
      for( index1 = THRESHOLD - 1; index1 > 1; index1-- ) {
         for( index2 = index1 << 1; index2 < THRESHOLD; index2 += index1 ) {
            myArray[ index2 ] += 32;
         }
      }
      long result = 0;
      for( index1 = THRESHOLD - 1; index1 > 1; index1 -= 100 ) {
         result += myArray[ index1 ];
      }
      if( iteration > 0 ) {
         long delta = System.currentTimeMillis() - begin;
         sum += delta;
         min = Math.min(  min,  delta );
         max = Math.max(  max,  delta );
         System.out.println( iteration + ": " + result );
      }
   }

   public static void main( String[] args ) {
      for( int iteration = 0; iteration < 1000; ++iteration ) {
         perf( iteration );
      }
      long average = sum / 999;// the first is ignored
      System.out.println( "Min    : " + min     + " ms" );
      System.out.println( "Average: " + average + " ms" );
      System.out.println( "Max    : " + max     + " ms" );
   }
}

And the results of only 10 iterations are:

With final:

Min    : 7645 ms
Average: 7659 ms
Max    : 7926 ms

Without final:

Min    : 7629 ms
Average: 7780 ms
Max    : 7957 ms

I suggest that readers run this test and post their results for comparison.
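One practical note for anyone changing the iteration count: the listing divides by a hardcoded 999, which only matches a loop of 1000 iterations, so the divisor has to be changed along with the loop bound. A possible way around this, sketched under the assumption that the per-iteration timings are collected in an array first, is to derive the count from the samples. TimingStats and summarize are hypothetical names, and the sample values in main are made up purely for illustration.

```java
public class TimingStats {
    // Returns {min, average, max} over samples[warmup..], ignoring the first 'warmup' entries.
    static long[] summarize(long[] samples, int warmup) {
        if (warmup < 0 || warmup >= samples.length) {
            throw new IllegalArgumentException("warmup must leave at least one sample");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long sum = 0L;
        for (int i = warmup; i < samples.length; i++) {
            sum += samples[i];
            min = Math.min(min, samples[i]);
            max = Math.max(max, samples[i]);
        }
        // The divisor follows the number of samples actually measured.
        long count = samples.length - warmup;
        return new long[] { min, sum / count, max };
    }

    public static void main(String[] args) {
        // Made-up timings in ms; the first entry plays the role of the ignored first iteration.
        long[] samplesMs = { 50, 10, 20, 30 };
        long[] stats = summarize(samplesMs, 1);
        System.out.println("Min    : " + stats[0] + " ms");
        System.out.println("Average: " + stats[1] + " ms");
        System.out.println("Max    : " + stats[2] + " ms");
    }
}
```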
