
C++ OpenMP for-loop global variable problems

I'm new to OpenMP, and from what I have read about OpenMP 2.0 (which comes standard with Microsoft Visual Studio 2010), global variables are considered troublesome and error-prone when used in parallel programming. I have come to share that view, since I have found very little on how to deal with global variables and static global variables efficiently, or at all for that matter.

I have this snippet of code which runs, but because of the local variable created in the parallel block I don't get the answer I'm looking for. I get 8 different printouts (because that's how many threads I have on my PC) instead of 1 answer. I know this is because of the local variable "list" created in the parallel block, but the code will not run if I move "list" out and make it a global variable. Actually, the code does run, but it never gives me an answer back. This is the sample code that I would like to modify to use a global "list" variable:

#pragma omp parallel
{
    vector<int> list;
#pragma  omp for
    for(int i = 0; i < 50000; i++) 
    {
        list.push_back(i);
    }
    cout << list.size() << endl;
}

Output:

6250
6250
6250
6250
6250
6250
6250
6250

They add up to 50000, but I did not get one answer of 50000; instead the result is divided up across the threads.

Solution:

vector<int> list;
#pragma omp parallel
{
#pragma omp for
    for(int i = 0; i < 50000; i++) 
    {
        cout << i << endl;
#pragma omp critical
        {
            list.push_back(i);
        }
    }
}
cout << list.size() << endl;

According to the MSDN documentation, the parallel directive

Defines a parallel region, which is code that will be executed by multiple threads in parallel.

And since the list variable is declared inside this region, every thread will have its own list.

On the other hand, the for pragma

Causes the work done in a for loop inside a parallel region to be divided among threads.

So the 50000 iterations will be split among the threads, but each thread will still have its own list. I think what you are trying to do can be achieved by:

  1. Moving the list definition outside the "parallel" section.
  2. Protecting the list.push_back statement with a critical section.

Try this:

vector<int> list;
#pragma omp parallel
{
#pragma omp for
    for(int i = 0; i < 50000; i++) 
    {
        // std::vector::push_back is not thread-safe, so serialize the insertion
#pragma omp critical
        {
            list.push_back(i);
        }
    }
}
cout << list.size() << endl;

I don't think you will get any speedup from OpenMP in this case, because there will be contention for the critical section. A faster solution (if you don't care about the order of the elements) would be for every thread to fill its own list and then merge those lists after the loop finishes. An implementation using std::list instead of std::vector would look cleaner here, because you wouldn't have to copy arrays when merging.
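
For illustration, here is a minimal sketch of that per-thread-list idea (the names merged and local are mine, not from the original post): each thread appends to a private std::list, and the lists are spliced into one shared result in a critical section that is entered only once per thread instead of once per element.

#include <iostream>
#include <list>

int main()
{
    std::list<int> merged;                 // shared result

#pragma omp parallel
    {
        // Each thread fills its own private list, so no locking is needed in the loop.
        std::list<int> local;

#pragma omp for
        for (int i = 0; i < 50000; i++)
        {
            local.push_back(i);
        }

        // Merge once per thread after the loop; std::list::splice moves nodes in O(1).
#pragma omp critical
        {
            merged.splice(merged.end(), local);
        }
    }

    std::cout << merged.size() << std::endl;   // prints 50000 (order is not sequential)
    return 0;
}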

Some apps are memory bound and not compute bound. Bottom line: check if you actually get a speedup from OpenMP.
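
One simple way to check, sketched below on the assumption that the loop above is the part being measured, is to wrap it in omp_get_wtime() calls and compare the wall-clock time against a single-threaded run (for example by setting the environment variable OMP_NUM_THREADS=1):

#include <iostream>
#include <vector>
#include <omp.h>

int main()
{
    std::vector<int> list;

    double start = omp_get_wtime();        // wall-clock time in seconds

#pragma omp parallel
    {
#pragma omp for
        for (int i = 0; i < 50000; i++)
        {
#pragma omp critical
            {
                list.push_back(i);
            }
        }
    }

    double elapsed = omp_get_wtime() - start;
    std::cout << list.size() << " elements in " << elapsed << " s" << std::endl;
    return 0;
}

If the multi-threaded run is not faster than the single-threaded one, the critical section (or memory bandwidth) is the bottleneck.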

Why do you need the first pragma here (#pragma omp parallel)? I think that's the issue.
