
How to parallelize a while loop with OpenMP + MPI

I have an iterative algorithm that needs OpenMP and MPI to speed it up. Here is my code:

#pragma omp parallel 
while (allmax > E) /* The precision requirement */
{
    lmax = 0.0;
    for(i = 0; i < m; i ++)
    {
        if(rank * m + i < size)
        {
            sum = 0.0;
            for(j = 0; j < size; j ++)
            {
                if (j != (rank * m + i)) sum = sum + a(i, j) * v(j);
            }
            /* computes the new elements */
            v1(i) = (b(i) - sum) / a(i, rank * m + i);
            #pragma omp critical
            {
                if (fabs(v1(i) - v(i)) > lmax)
                     lmax = fabs(v1(i) - v(rank * m + i));
            }
        }
     }
    /*Find the max element in the vector*/           
    MPI_Allreduce(&lmax, &allmax, 1, MPI_FLOAT, MPI_MAX, MPI_COMM_WORLD);
    /*Gather all the elements of the vector from all nodes*/
    MPI_Allgather(x1.data(), m, MPI_FLOAT, x.data(), m, MPI_FLOAT, MPI_COMM_WORLD);
    #pragma omp critical
    {
        loop ++;
    }
}

But I got no speedup, and it doesn't even compute the right answer. What is wrong with my code? Does OpenMP not support while loops? Thank you!

Regarding your question: the #pragma omp parallel construct simply spawns the OpenMP threads and executes the block that follows it in parallel. And yes, it does support executing while loops, as this minimal example shows:

#include <stdio.h>
#include <omp.h>

int main (void)
{
    int i = 0;
    #pragma omp parallel
    while (i < 10)
    {
        printf ("Hello. I am thread %d and i is %d\n", omp_get_thread_num(), i);
        #pragma omp atomic
        i++;
    }
}

However, as Tim18 and you yourself mention, there are several caveats in your code. Each thread needs to work on its own data, and the MPI calls here cause race conditions because they are executed by all the threads at once.

What about this change in your code?

while (allmax > E) /* The precision requirement */
{
    lmax = 0.0;

    #pragma omp parallel for shared (m,size,rank,v,v1,b,a,lmax) private(sum,j)
    for(i = 0; i < m; i ++)
    {
        if(rank * m + i < size)
        {
            sum = 0.0;
            for(j = 0; j < size; j ++)
            {
                if (j != (rank * m + i)) sum = sum + a(i, j) * v[j];
            }
            /* computes the new elements */
            v1[i] = (b[i] - sum) / a(i, rank * m + i);

            #pragma omp critical
            {
                if (fabs(v1[i] - v[rank * m + i]) > lmax)
                    lmax = fabs(v1[i] - v[rank * m + i]);
            }
        }
    }

    /*Find the max element in the vector*/           
    MPI_Allreduce(&lmax, &allmax, 1, MPI_FLOAT, MPI_MAX, MPI_COMM_WORLD);

    /*Gather all the elements of the vector from all nodes*/
    MPI_Allgather(v1.data(), m, MPI_FLOAT, v.data(), m, MPI_FLOAT, MPI_COMM_WORLD);

    loop ++;
}

The main while loop is executed serially, but as soon as the loop body is reached OpenMP spawns the work onto multiple threads at the #pragma omp parallel for. Using #pragma omp parallel for (rather than #pragma omp parallel) automatically distributes the iterations of the loop among the worker threads. Also, you need to specify the data-sharing attributes of the variables (shared, private) used in the parallel region; I have guessed them here from your code.

At the end of the while loop the MPI calls are then invoked by the master thread only. Note that for this hybrid pattern to be portable, MPI must have been initialized with at least MPI_THREAD_FUNNELED thread support.
