简体   繁体   中英

Java Optimizing arithmetic and Assignment Operators for large input

I have a piece of code that must run extremely fast in terms of clock speed. The algorithm is already in O(N). It takes 2seconds, it needs to take 1s. For most A.length inputs ~ 100,000 it takes .3s unless a particular line of code is invoked an extreme number of times. (For an esoteric programming challenge)

It uses a calculation of the arithmetic series that 1,2,..N -> 1,3,4,10,15.. that can be represented by n*(n+1)/2 I loop through this equation hundreds of thousands of times. I do not have access to the input, nor can I display it. The only information I am able to get returned is the time it took to run. particularly the equation is:

s+=(n+c)-((n*(n+1))/2);

s and c can have values range from 0 to 1Billion

n can range 0 to 100,000

What is the most efficient way to write this statement in terms of clock speed? I have heard division takes more time then multiplication, but beyond that I could not determine whether writing this in one line or multiple assignment lines was more efficient. Dividing and multiplying versus multiplying and then dividing? Also would creating custom integers types significantly help?

Edit as per request, full code with small input case (sorry if it's ugly, I've just kept stripping it down):

public static void main(String[] args) {

        int A[]={3,4,8,5,1,4,6,8,7,2,2,4};//output 44
        int K=6;
        //long start = System.currentTimeMillis();;
        //for(int i=0;i<100000;i++){
            System.out.println(mezmeriz4r(A,K));
        //}
        //long end = System.currentTimeMillis();;

//      System.out.println((end - start) + " ms");

    }
    public static int mezmeriz4r(int[]A,int K){
        int s=0;
        int ml=s;
        int mxl=s;
        int sz=1;
        int t=s;
        int c=sz;
        int lol=50000;
        int end=A.length;
        for(int i=sz;i<end;i++){
            if(A[i]>A[mxl]){
                mxl=i;
            }else if(A[i]<A[ml]){
                ml=i;
            }
            if(Math.abs(A[ml]-A[mxl])<=K){
                sz++;
                if(sz>=lol)return 1000000000;
                if(sz>1){
                    c+=sz;
                }
            }else{
                if(A[ml]!=A[i]){
                    t=i-ml;
                    s+=(t+c)-((t*(t+1))/(short)2);
                    i=ml;
                    ml++;
                    mxl=ml;
                }else{
                    t=i-mxl;
                    s+=(t+c)-((t*(t+1))/(short)2);
                    i=mxl;
                    mxl++;
                    ml=mxl;
                }
                c=1;
                sz=0;
            }
        }
        if(s>1000000000)return 1000000000;
        return s+c;
    }

Returned from Challenge:

Detected time complexity:

O(N)

test time result

example example test 0.290 s. OK

single single element 0.290 s. OK

double two elements 0.290 s. OK

small_functional small functional tests 0.280 s. OK

small_random small random sequences length = ~100 0.300 s. OK

small_random2 small random sequences length = ~100 0.300 s. OK

medium_random chaotic medium sequences length = ~3,000 0.290 s. OK

large_range large range test, length = ~100,000 2.200 s. TIMEOUT ERROR running time: >2.20 sec., time limit: 1.02 sec.

large_random random large sequences length = ~100,000 0.310 s. OK

large_answer test with large answer 0.320 s. OK

large_extreme all maximal value = ~100,000 0.340 s. OK

With a little algebra, you can simply the expression (n+c)-((n*(n+1))/2) to c-((n*(n-1))/2) to remove an addition operation. Then you can replace the division by 2 with a bit-shift to the right by 1 , which is faster than division. Try replacing

s+=(n+c)-((n*(n+1))/2);

with

s+=c-((n*(n-1))>>1);

I would try the following and profile the code after each change to check if there is any gain in speed.


replace:

if(Math.abs(A[ml]-A[mxl])<=K)

by

int diff = A[ml]-A[mxl];
if(diff<=K && diff>=-K)

replace

/2

by

>>1

replace

ml++;
mxl=ml;

by

mxl=++ml;

Maybe avoid array access of the same element (internal boundary checks of java may take some time)

So staore at least A[i] in a local varianble.

在for循环中删除System.out.println():)您会惊讶于计算速度有多快

Nested assignments, ie instead of

t=i-ml;
s+=(t+c)-((t*(t+1))/(short)2);
i=ml;
ml++;
mxl=ml;

something like

s+=((t=i-ml)+c);
s-=((t*(t+1))/(short)2);
i=ml;
mxl=++ml;

sometimes occurs in OpenJDK sources. It mainly results in replacing *load bytecode instructions with *dup s. According to my experiments, it really gives a very little speedup, but it is ultra hadrcore, I don't recommend to write such code manually.

I dont have access to validate all inputs. and time range. but this one runs O(N) for sure. and have improved. run and let me know your feedback.i will provide details if necessary

public static int solution(int[]A,int K){
    int minIndex=0;
    int maxIndex=0;
    int end=A.length;
    int slize = end;
    int startIndex = 0;
    int diff = 0;
    int minMaxIndexDiff = 0;
    for(int currIndex=1;currIndex<end;currIndex++){
        if(A[currIndex]>A[maxIndex]){
            maxIndex=currIndex;
        }else if(A[currIndex]<A[minIndex]){
            minIndex=currIndex;
        }
        if( (A[maxIndex]-A[minIndex]) >K){
            minMaxIndexDiff= currIndex- startIndex;
            if (minMaxIndexDiff > 1){
                slize+= ((minMaxIndexDiff*(minMaxIndexDiff-1)) >> 1);
                if (diff > 0 ) {
                    slize = slize + (diff * minMaxIndexDiff);
                }
            }

            if (minIndex == currIndex){
                diff = currIndex - (maxIndex + 1);
            }else{
                diff = currIndex - (minIndex + 1);
            }
            if (slize > 1000000000) {
                return 1000000000;
            }
            minIndex = currIndex;
            maxIndex = currIndex;
            startIndex = currIndex;
        }
    }
    if ( (startIndex +1) == end){
        return slize;
    }
    if (slize > 1000000000) {
        return 1000000000;
    }
    minMaxIndexDiff= end- startIndex;
    if (minMaxIndexDiff > 1){
        slize+= ((minMaxIndexDiff*(minMaxIndexDiff-1)) >> 1);
        if (diff > 0 ) {
            slize = slize + (diff * minMaxIndexDiff);
        }
    }

    return slize;
}

I would try to elimnate this line if(Math.abs(A[ml]-A[mxl])<= by a faster self calculated abs version, which is inlined, not a method call!

The cast to (short) does not help, but try the right shift operator X >>1 instead x / 2

removing the System.out.println() can speed up by factor of 1000. But be carefull otherwise your whole algorithm can be removed by the VM becasue you dont use it. Old code:

for(int i=0;i<100000;i++){
            System.out.println(mezmeriz4r(A,K));
}

New code:

int dummy = 0;
    for(int i=0;i<100000;i++){
          dummy =   mezmeriz4r(A,K);
    }
//Use dummy otherwise optimisation can remove  mezmeriz4r
System.out.print("finished: " + dummy);

I would create a C version first and see, how fast it can go with "direct access to the metal". Chances are, you are trying to optimize calculation which is already optimized to the limit.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM