
Why does test1() run much faster than test2()?

import java.util.Random;


public class Test{
    static int r = new Random().nextInt(2);
    static int a(){
        return r==1 ? 1 :0;
    }

    public static void test1() throws  Exception {
        //
        System.out.println(1403187139018L);
        for (int i = 0; i <   1073741824; i++) {}//*

        // Thread.sleep(20000);
        long d = 0;

        for (int j = 0; j < 10; j++) {
            long y = System.currentTimeMillis();

            for (int x = 0; x < 1073741823; x++) {
                d += r==0?1:0;
            }
            System.out.println((System.currentTimeMillis() -y));
        }
    }

    public static void test2()  throws  Exception{

        // Thread.sleep(20000);
        long d = 0;

        for (int j = 0; j < 10; j++) {
            long y = System.currentTimeMillis();

            for (int x = 0; x < 1073741824; x++) {
                d += r==0?1:0;
            }
            System.out.println((System.currentTimeMillis() -y));

            // System.out.println("time:"+ (System.currentTimeMillis() - y));
        }
    }

    public static void main(String[] args) throws  Exception{
        // Thread.sleep(20000);

        test1();
        test2();

    }

}

When I run the above code, I get this output:

32
26
28
28
32
29
35
33
30
31
1321
1308
1324
1277
1348
1321
1337
1413
1287
1331

Why is test1 so much faster?

There is no difference except the following:

System.out.println(1403187139018L);
for (int i = 0; i <   1073741824; i++) {}//*

Also, each timed loop in test1 takes only 25-35 milliseconds, which I find unbelievable. I wrote the same code in C, and there each for loop took about 4 seconds.

This behavior seems strange. How do I know when to add:

System.out.println(1403187139018L);
for (int i = 0; i <   1073741824; i++) {}//*

Also, if I change

r==0?1:0

to

a()

then test2() runs faster than test1().

The output I get is:

1403187139018
3726
3729
3619
3602
3797
4362
4498
3816
4143
4368
1673
1386
1388
1323
1296
1337
1294
1283
1235
1460

The original legacy code: ...

long t = System.currentTimeMillis();
MappedByteBuffer mbb = map(new File("temp.mmp"), 1024L * 1024 * 1024);

System.out.println("load " + (System.currentTimeMillis() - t));//*
for (int i = 0; i < 2014L * 1024 * 1024; i++) {}//*
int d = 0;
for (int j = 0; j < 10; j++) {
    t = System.currentTimeMillis();
    mbb.position(0);
    mbb.limit(mbb.capacity());

    for (int i = 0; i < mbb.capacity(); i++) {
        d += mbb.get();
    }

    ....
}
System.out.println(d);

The empty loop probably triggers JIT compilation of the first method. The JIT is smart enough to realize that your code doesn't do anything useful apart from reading the current time and printing the difference, so it optimizes the useless code away by not running it at all.

If you write actual, useful code, the JIT will do the right thing. Don't try to mess with it by adding empty loops.
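
As a minimal sketch of the dead-code point above (the class and method names are hypothetical, not from the question): the timed loop is moved into its own method, and its result is returned and printed so that the work stays observable. This only removes one source of distortion; the JIT may still simplify a loop this trivial, so the numbers should not be read as a reliable measurement.

```java
import java.util.Random;

final class LiveResultTiming {
    // Same idea as the question's static field: a value not known at compile time.
    static int r = new Random().nextInt(2);

    // The measured loop lives in its own method and returns what it computed.
    static long countZeros(int iterations) {
        long d = 0;
        for (int x = 0; x < iterations; x++) {
            d += r == 0 ? 1 : 0;
        }
        return d;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int j = 0; j < 10; j++) {
            long start = System.nanoTime();
            sink += countZeros(1 << 30); // 1073741824 iterations, as in test2()
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(elapsedMs);
        }
        // Using the accumulated result means the loops cannot be treated as unused work.
        System.out.println("result: " + sink);
    }
}
```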

Many factors affect JIT compilation:

  1. Execution statistics. While the interpreter runs a method, it collects statistics: which paths are executed, which branches are taken, which class instances are seen, and so on. In test1 the statistics are collected inside the first (empty) loop, which misleads the JIT compiler about the real execution scenario.
  2. Class initialization and uncommon traps. When you remove the first System.out.println from test1, not all classes related to printing are initialized. An attempt to invoke a method of an uninitialized class causes an uncommon trap, which leads to deoptimization and then recompilation of the method using the new knowledge.
  3. The misleading statistics collected in test1 backfire when you replace r==0?1:0 with a(). When test1 is compiled, a() has never been executed and so has not had a chance to be optimized. That is why test1 runs slower than test2, which was compiled with knowledge of a().

Of course, it is hard to predict every factor affecting JIT compilation when writing a microbenchmark from scratch. That is why the recommended way to benchmark your code is to use a dedicated framework in which most of these problems have already been solved. I personally suggest JMH.
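
To illustrate the kind of bookkeeping such a framework takes off your hands, here is a heavily simplified, standard-library-only sketch. It is not how JMH works internally, and every name in it (TinyHarness, workload, averageNanos) is made up for this example. The workload is a small method that is called many times, a warm-up phase runs the same code that is later measured, and the results are accumulated in a sink so they are not discarded.

```java
import java.util.Random;
import java.util.function.LongSupplier;

final class TinyHarness {
    static int r = new Random().nextInt(2);

    // Same helper as in the question.
    static int a() {
        return r == 1 ? 1 : 0;
    }

    // The code under test: a short method that is invoked repeatedly,
    // so the JIT sees a() being executed before and during measurement.
    static long workload() {
        long d = 0;
        for (int x = 0; x < 1_000_000; x++) {
            d += a();
        }
        return d;
    }

    // Runs the task `warmup` times untimed, then `measured` times timed.
    // Returns the average wall-clock time per call in nanoseconds.
    static double averageNanos(LongSupplier task, int warmup, int measured, long[] sink) {
        for (int i = 0; i < warmup; i++) {
            sink[0] += task.getAsLong();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            sink[0] += task.getAsLong();
        }
        return (System.nanoTime() - start) / (double) measured;
    }

    public static void main(String[] args) {
        long[] sink = new long[1];
        double avg = averageNanos(TinyHarness::workload, 200, 200, sink);
        System.out.printf("avg %.0f ns per call (sink=%d)%n", avg, sink[0]);
    }
}
```

A real framework also handles things this sketch ignores, such as running each benchmark in a fresh JVM and reporting statistical error, which is why the hand-rolled version is only useful for understanding the idea.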

When code is optimised, the JIT uses the information it has about how the code has run so far. In test1() the first loop triggers optimisation of the whole method, but at that point there is no information about how the second loop will run, so it is not optimised as well as the loop in test2().

What I would expect is for the method to be re-optimised; however, for that to happen the JVM has to detect that an assumption it made the first time is no longer valid.

As a guess at what might differ: the loop in test2() could have been unrolled, whereas in test1() it has not. That would explain the difference in performance.
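
Rather than guessing, JIT activity can be observed. On HotSpot, running with the -XX:+PrintCompilation flag logs each method as it is compiled. The sketch below is a coarser, programmatic alternative that reads the accumulated compilation time from the standard CompilationMXBean before and after a hot loop. The class and method names are hypothetical, and it only shows that compilation happened in between, not which optimisations were applied.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class JitActivityProbe {
    // Accumulated JIT compilation time in ms, or -1 if this JVM does not report it.
    static long jitMillis() {
        CompilationMXBean bean = ManagementFactory.getCompilationMXBean();
        if (bean == null || !bean.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return bean.getTotalCompilationTime();
    }

    // A loop hot enough to be compiled; the result is returned so it stays in use.
    static long hotLoop(int iterations, int seed) {
        long d = 0;
        for (int x = 0; x < iterations; x++) {
            d += (x ^ seed) & 1;
        }
        return d;
    }

    public static void main(String[] args) {
        long before = jitMillis();
        long result = 0;
        for (int j = 0; j < 10; j++) {
            result += hotLoop(100_000_000, j);
        }
        long after = jitMillis();
        System.out.println("result: " + result);
        System.out.println("JIT time before: " + before + " ms, after: " + after + " ms");
    }
}
```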
