Why are local variable length for-loops faster? Doesn't branch prediction reduce the effect of lookup times?

A while back, I was reading up on some Android performance tips when I came across this:

Foo[] mArray = ...

public void zero() {
    int sum = 0;
    for (int i = 0; i < mArray.length; ++i) {
        sum += mArray[i].mSplat;
    }
}

public void one() {
    int sum = 0;
    Foo[] localArray = mArray;
    int len = localArray.length;

    for (int i = 0; i < len; ++i) {
        sum += localArray[i].mSplat;
    }
}

Google says:

zero() is slowest, because the JIT can't yet optimize away the cost of getting the array length once for every iteration through the loop.

one() is faster. It pulls everything out into local variables, avoiding the lookups. Only the array length offers a performance benefit.

That made total sense. But after thinking way too much about my computer architecture exam, I remembered branch predictors:

a branch predictor is a digital circuit that tries to guess which way a branch (eg an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline.

Isn't the computer assuming that i < mArray.length is true and thus computing the loop condition and the body of the loop in parallel (mispredicting the branch only on the last iteration), effectively removing any performance losses?

I was also thinking about speculative execution:

Speculative execution is an optimization technique where a computer system performs some task that may not be actually needed... The objective is to provide more concurrency...

In this case, wouldn't the computer be executing the code both as if the loop had finished and as if it were still going, concurrently, once again effectively nullifying any computational cost associated with the condition (since the computer is already performing computations for the future while it evaluates the condition)?

Essentially, what I'm getting at is this: even if the condition in zero() takes a little longer to compute than the one in one(), the computer is usually going to execute the correct branch of code while it's waiting for the result of the condition anyway, so the performance loss from the lookup of mArray.length shouldn't matter (that's what I thought, anyway).

Is there something I'm not realizing here?


Sorry about the length of the question.

Thanks in advance.

The site you linked to notes:

zero() is slowest, because the JIT can't yet optimize away the cost of getting the array length once for every iteration through the loop.

I haven't tested on Android, but I'll assume that this is true for now. What this means is that on every iteration of the loop the CPU has to execute code that loads the value of mArray.length from memory. The reason is that mArray is a field that could be reassigned to a different array while the loop runs (a Java array's own length is fixed once it is created), so the compiler can't treat mArray.length as a constant.

In the one() option, by contrast, the programmer explicitly sets the len variable based on the knowledge that the array length won't change. Since this is a local variable, the compiler can keep it in a register rather than loading it from memory on each loop iteration. This reduces the number of instructions executed in the loop, and the loop condition no longer has to wait on a memory load before the branch can be resolved.
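To make the difference concrete, here is a minimal sketch of a case where the two loop styles actually behave differently, which is why a compiler cannot blindly hoist the field read out of the loop. The class and method names (FieldReloadDemo, countWithFieldRead, countWithLocalCopy) are hypothetical and exist only for this illustration; the reassignment inside the loop body is contrived on purpose.

```java
// Hypothetical demo class; none of these names come from the Android docs.
class FieldReloadDemo {
    int[] mArray = new int[8];

    // zero()-style loop: the condition re-reads the field on every iteration,
    // so it observes the reassignment made inside the loop body.
    int countWithFieldRead() {
        int iterations = 0;
        for (int i = 0; i < mArray.length; ++i) {
            if (i == 2) {
                mArray = new int[4]; // the field now refers to a shorter array
            }
            ++iterations;
        }
        return iterations;
    }

    // one()-style loop: the length is captured once in a local variable,
    // so later changes to the field do not affect the loop bound.
    int countWithLocalCopy() {
        int[] localArray = mArray;
        int len = localArray.length;
        int iterations = 0;
        for (int i = 0; i < len; ++i) {
            if (i == 2) {
                mArray = new int[4]; // ignored by the loop condition
            }
            ++iterations;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(new FieldReloadDemo().countWithFieldRead()); // prints 4
        System.out.println(new FieldReloadDemo().countWithLocalCopy()); // prints 8
    }
}
```

In the loops from the question nothing reassigns mArray, so both versions compute the same result; the point is only that the compiler (or JIT) has to prove that before it may keep the length in a register on its own.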

You are right that branch prediction helps reduce the overhead associated with the loop condition check. But there is still a limit to how much speculation is possible, so executing more instructions in each loop iteration can incur additional overhead. Also, many mobile processors have less advanced branch predictors and don't support as much speculation.

My guess is that on a modern desktop processor with an advanced Java JIT like HotSpot, you would not see a 3X performance difference. But I don't know for certain; it could be an interesting experiment to try.
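If you want to try that experiment, a rough sketch could look like the following. It is a naive System.nanoTime timing loop, not a rigorous benchmark (a harness such as JMH would be more trustworthy), and the names LoopTimingSketch and timeNanos, the array size, and the round counts are all assumptions made for this example. Unlike the original snippet, zero() and one() return their sum here so that the JIT cannot simply discard the loops.

```java
import java.util.function.IntSupplier;

// Hypothetical timing sketch; results depend heavily on the JVM, device, and warm-up.
class LoopTimingSketch {
    static class Foo {
        int mSplat;
        Foo(int value) { mSplat = value; }
    }

    Foo[] mArray;

    LoopTimingSketch(int size) {
        mArray = new Foo[size];
        for (int i = 0; i < size; ++i) {
            mArray[i] = new Foo(i % 7);
        }
    }

    // Same loop as zero() in the question, but returning the sum.
    int zero() {
        int sum = 0;
        for (int i = 0; i < mArray.length; ++i) {
            sum += mArray[i].mSplat;
        }
        return sum;
    }

    // Same loop as one() in the question, but returning the sum.
    int one() {
        int sum = 0;
        Foo[] localArray = mArray;
        int len = localArray.length;
        for (int i = 0; i < len; ++i) {
            sum += localArray[i].mSplat;
        }
        return sum;
    }

    // Runs the given loop `rounds` times and returns the total elapsed nanoseconds.
    static long timeNanos(IntSupplier loop, int rounds) {
        int sink = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; ++r) {
            sink += loop.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.print(""); // uses the result so the work is not optimized away
        }
        return elapsed;
    }

    public static void main(String[] args) {
        LoopTimingSketch sketch = new LoopTimingSketch(100_000);
        // Warm-up rounds so the JIT has a chance to compile both methods first.
        timeNanos(sketch::zero, 200);
        timeNanos(sketch::one, 200);
        System.out.println("zero(): " + timeNanos(sketch::zero, 1000) / 1000 + " ns per call");
        System.out.println("one():  " + timeNanos(sketch::one, 1000) / 1000 + " ns per call");
    }
}
```

Treat any numbers this prints as a rough indication only; running it on an Android device and on a desktop JVM would be the comparison the guess above is about.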

