
CyclicBarrier Wasting Time

I'm implementing a parallel algorithm. Without CyclicBarrier I can chug through the work in half the sequential time. Using CyclicBarrier makes it take up to 100 times longer. I'll include my thread calls and thread function so you can see what is going on and try to help me out. The CyclicBarrier is reused and new threads are spawned every time. For some reason the try { barrier.await(); } bit is spinning for a LONG time.

//Threads use this ...
private class threadILoop implements Runnable {
    protected int start, end, j, k;
    public threadILoop(int start,int end,int j,int k){
        this.start = start;
        this.end = end;
        this.j = j;
        this.k = k;
    }
    public void run() {
        for (int z = start; z < end; z++) {

            int zxj = z ^ j;
            if(zxj > z){
                if((z&k) == 0 && (data[z] > data[zxj]))
                    swap(z, zxj);
                if((z&k) != 0 && (data[z] < data[zxj]))
                    swap(z, zxj);
            }

            try{barrier.await();}
            catch (InterruptedException ex) { return; }
            catch (BrokenBarrierException ex) {return; }
        }
    }
}
// Main driver: the CyclicBarrier is allocated here and the threads are spawned from here.
 private void loopSort() throws InterruptedException {
        //print(data);
        barrier = new CyclicBarrier(N_THREADS);
        int kMax = data.length;
        for(int k = 2; k<=kMax; k*=2){
            for (int j = k/2; j > 0; j/=2) {

                int piece = data.length/N_THREADS;

                if(j > N_THREADS) {
                    //DIVIDE UP DATA SPACE FOR THREADS -> do work faster
                    int start = 0;
                    for(int i = 0; i < N_THREADS; i++)
                        {
                            int end =  i == N_THREADS - 1 ? data.length : start + piece;
                            threads[i] = new Thread(new threadILoop(start, end, j, k));
                            //threads[i].start();
                            start = end;
                        }

                    for(int i = 0; i < N_THREADS; i++)
                        {
                            threads[i].start();
                        }




                    // print(data);

                    for(int i = 0; i < N_THREADS; i++)
                        {
                            threads[i].join();
                        }
                }





Your barrier is too deep inside the loop. Right now each thread gets a range of elements to process, and all threads process one element, wait for all the other threads, then process the next element, wait again, and so on. In this case the overhead of waiting and coordinating between the threads far outweighs the actual processing.
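To get a feel for the scale, here is a rough count of how often each thread hits the barrier. This is only a sketch under assumptions that are not in the question: `BarrierTripCount` and its methods are hypothetical names, the array length is a power of two that divides evenly among the threads, and every (k, j) pass is run by the worker threads (the posted driver only does so when `j > N_THREADS`).

```java
// Hypothetical helper (not part of the question's code): counts barrier awaits.
final class BarrierTripCount {

    // Number of (k, j) passes performed by the nested loops in loopSort()
    // for an array of n elements, n being a power of two.
    static long passes(long n) {
        long count = 0;
        for (long k = 2; k <= n; k *= 2) {
            for (long j = k / 2; j > 0; j /= 2) {
                count++;
            }
        }
        return count;
    }

    // Awaits per thread when barrier.await() sits inside the element loop:
    // one await for every element of the thread's range, in every pass.
    static long awaitsPerThreadPerElement(long n, int nThreads) {
        return passes(n) * (n / nThreads);
    }

    // Awaits per thread when barrier.await() is called once per pass.
    static long awaitsPerThreadPerPass(long n) {
        return passes(n);
    }

    public static void main(String[] args) {
        long n = 1L << 20; // 1,048,576 elements (assumed size)
        int nThreads = 4;  // assumed thread count
        System.out.println("passes:            " + passes(n));
        System.out.println("awaits per element: " + awaitsPerThreadPerElement(n, nThreads));
        System.out.println("awaits per pass:    " + awaitsPerThreadPerPass(n));
    }
}
```

For 2^20 elements and 4 threads this gives 210 passes, so about 55 million awaits per thread with the barrier inside the element loop versus 210 with one await per pass. Each compare-and-swap is a handful of instructions, while each await involves locking and possibly parking the thread.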

Try to process more elements before synchronizing with the other threads, for example by processing the whole range and then waiting.

//Threads use this ...
private class threadILoop implements Runnable {
    protected int start, end, j, k;
    public threadILoop(int start,int end,int j,int k){
        this.start = start;
        this.end = end;
        this.j = j;
        this.k = k;
    }
    public void run() {
        for (int z = start; z < end; z++) {    
            int zxj = z ^ j;
            if(zxj > z){
                if((z&k) == 0 && (data[z] > data[zxj]))
                    swap(z, zxj);
                if((z&k) != 0 && (data[z] < data[zxj]))
                    swap(z, zxj);
            }
            // Wait moved from here
        }
        // To here (outside the inner loop)
        try{barrier.await();}
        catch (InterruptedException ex) { return; }
        catch (BrokenBarrierException ex) {return; }
    }
}
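Since the question mentions that new threads are spawned for every pass, a possible follow-up is to start the threads once and let each one walk through all (k, j) passes itself, awaiting the barrier once per pass. The following is a minimal self-contained sketch of that idea, not the asker's actual program: the class name `BarrierBitonicSort`, the `sort` signature, and the test data in `main` are assumptions made for illustration, and the array length is assumed to be a power of two.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical standalone version of the sort, with one barrier await per pass.
final class BarrierBitonicSort {

    // Sorts data ascending. Assumes data.length is a power of two.
    static void sort(int[] data, int nThreads) throws InterruptedException {
        final int n = data.length;
        final CyclicBarrier barrier = new CyclicBarrier(nThreads);
        final int piece = n / nThreads;
        Thread[] threads = new Thread[nThreads];

        for (int i = 0; i < nThreads; i++) {
            final int start = i * piece;
            final int end = (i == nThreads - 1) ? n : start + piece;
            threads[i] = new Thread(() -> {
                try {
                    // Every thread runs the same sequence of (k, j) passes,
                    // so all of them reach the barrier the same number of times.
                    // The k > 0 check guards against int overflow for huge arrays.
                    for (int k = 2; k > 0 && k <= n; k *= 2) {
                        for (int j = k / 2; j > 0; j /= 2) {
                            for (int z = start; z < end; z++) {
                                int zxj = z ^ j;
                                // Only the owner of the lower index touches the pair,
                                // so no two threads write the same elements in a pass.
                                if (zxj > z) {
                                    boolean ascending = (z & k) == 0;
                                    if (ascending ? data[z] > data[zxj] : data[z] < data[zxj]) {
                                        int tmp = data[z];
                                        data[z] = data[zxj];
                                        data[zxj] = tmp;
                                    }
                                }
                            }
                            // One await per pass: the next pass reads what this one wrote.
                            barrier.await();
                        }
                    }
                } catch (InterruptedException | BrokenBarrierException ex) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new Random(42).ints(1 << 16).toArray(); // assumed test input
        int[] expected = data.clone();
        Arrays.sort(expected);
        sort(data, 4);
        System.out.println("matches Arrays.sort: " + Arrays.equals(data, expected));
    }
}
```

With this layout the barrier replaces the per-pass `start()`/`join()` cycle, and because every thread awaits exactly once per pass, unevenly sized ranges cannot leave one thread waiting at the barrier after the others have finished.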

