
OpenMP for loop with specific threads

I'm new to parallel programming, and I'm trying to parallelize a point cloud processing step. My program structure is shown below. First, I split the point cloud into partial clouds. My aim is for each thread to call the fillFrustumCloud() function separately.

int num_threads = 12;

std::vector<CloudColored::Ptr> vector_colored_projected_clouds(num_threads);
std::vector<Cloud::Ptr> vector_projected_clouds(num_threads);

omp_set_num_threads(num_threads);

// private( ) shared()
#pragma omp parallel  shared(vector_colored_projected_clouds,vector_projected_clouds)
{
    
    for(int i=0; i<num_threads; i++)
    {

        #pragma omp critical
        {
            std::cout << "Thread id: " << omp_get_thread_num() << " loop id: " << i <<  std::endl;
        }

        const unsigned int  start_index = cloud_in->size()/num_threads*i;
        const unsigned int  end_index = cloud_in->size()/num_threads*(i+1);

        Cloud::Ptr partial_cloud(new Cloud);

        if(i==num_threads-1)
        {
            partial_cloud->points.assign(cloud_in->points.begin()+start_index, cloud_in->points.end());
        }else{
            partial_cloud->points.assign(cloud_in->points.begin()+start_index, cloud_in->points.begin()+end_index);
        }

        LidcamHelpers::fillFrustumCloud(partial_cloud, mat_point_transformer, img_size, vector_colored_projected_clouds,
                                        vector_projected_clouds, i, interested_detections, id, reshaped_img);
    }
}

But the output is:

Thread id: 0 loop id: 0
Thread id: 1 loop id: 0
Thread id: 2 loop id: 0
Thread id: 3 loop id: 0
Thread id: 0 loop id: 1
Thread id: 1 loop id: 1
Thread id: 2 loop id: 1
Thread id: 3 loop id: 1
Thread id: 0 loop id: 2
Thread id: 3 loop id: 2
Thread id: 2 loop id: 2
Thread id: 1 loop id: 2
Thread id: 3 loop id: 3
Thread id: 1 loop id: 3
Thread id: 2 loop id: 3
Thread id: 0 loop id: 3

According to my aim, it should be like this:

Thread id: 0 loop id: 0
Thread id: 1 loop id: 1
Thread id: 2 loop id: 2

Note: I pass vector_colored_projected_clouds and vector_projected_clouds into the function by reference in order to store the results. I guess they should be shared variables.

The #pragma omp parallel construct creates a parallel region with as many threads as you have configured. Hence, when you write:

#pragma omp parallel
{
    for(int i=0; i<num_threads; i++)
    {
       ... 
    }
}

every thread in the parallel region executes all the iterations of the loop. That is why your output has 16 lines (i.e., 4 threads x 4 loop iterations).
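The difference can be checked with a small counting sketch. The names Counts, count_runs_parallel_only and count_runs_parallel_for are made up for this illustration, and the loop body only increments a counter instead of doing real work. It assumes the file is compiled with OpenMP enabled (for example -fopenmp with GCC or Clang).

```cpp
#include <cassert>

// Hypothetical helper type: how many threads ran and how often the body ran.
struct Counts
{
    int team_size;
    int body_runs;
};

// Plain "parallel": every thread runs the whole loop.
Counts count_runs_parallel_only(int iterations)
{
    assert(iterations >= 0);
    int team_size = 0;
    int body_runs = 0;

    #pragma omp parallel
    {
        // Each thread in the team registers itself once.
        #pragma omp atomic
        team_size++;

        for (int i = 0; i < iterations; i++)
        {
            #pragma omp atomic
            body_runs++;
        }
    }
    return Counts{team_size, body_runs};
}

// "parallel for": the iterations are divided among the threads.
int count_runs_parallel_for(int iterations)
{
    assert(iterations >= 0);
    int body_runs = 0;

    #pragma omp parallel for
    for (int i = 0; i < iterations; i++)
    {
        #pragma omp atomic
        body_runs++;
    }
    return body_runs;
}
```

With a team of 4 threads and 4 iterations, the first function counts 16 body executions, matching the 16 lines in the question, while the second counts 4.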

To distribute the iterations of a loop among the threads, use #pragma omp for instead. In your code you can write either:

#pragma omp parallel
{
    #pragma omp for
    for(int i=0; i<num_threads; i++)
    {
       ... 
    }
}

or

#pragma omp parallel for
for(int i=0; i<num_threads; i++)
{
   ... 
}

Since you only want to distribute the iterations of the loop among the threads, you can use the latter (i.e., #pragma omp parallel for).
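Applied to the slicing logic from the question, this could look roughly like the sketch below. It is only an illustration under assumptions: a std::vector<int> stands in for the point cloud so that it compiles without PCL, split_into_slices is a made-up name, and the call to fillFrustumCloud() is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Splits cloud_in into num_slices contiguous parts, one loop iteration per part.
// The last slice also receives the remainder, as in the question's code.
std::vector<std::vector<int>> split_into_slices(const std::vector<int> &cloud_in, int num_slices)
{
    assert(num_slices > 0);
    std::vector<std::vector<int>> slices(num_slices);
    const std::size_t chunk = cloud_in.size() / num_slices;

    #pragma omp parallel for
    for (int i = 0; i < num_slices; i++)
    {
        const std::size_t start_index = chunk * i;
        const std::size_t end_index =
            (i == num_slices - 1) ? cloud_in.size() : chunk * (i + 1);

        // Each iteration writes only to its own slot of the shared vector,
        // so no critical section is needed here.
        slices[i].assign(cloud_in.begin() + static_cast<std::ptrdiff_t>(start_index),
                         cloud_in.begin() + static_cast<std::ptrdiff_t>(end_index));
    }
    return slices;
}
```

The shared result vector is safe in this sketch because every iteration touches a different element. Whether the same holds for vector_colored_projected_clouds and vector_projected_clouds depends on what fillFrustumCloud() does with index i, which the question does not show.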

It looks as though you are using

#pragma omp critical
{
    std::cout << "Thread id: " << omp_get_thread_num() << " loop id: " << i <<  std::endl;
}

for debugging purposes. Bear in mind, however, that even with the critical region, the order in which the threads print is non-deterministic. If you would rather have deterministic output, add the ordered clause to the loop directive (#pragma omp parallel for ordered) and wrap the output in #pragma omp ordered instead of critical. The ordered construct ensures that the code it encloses is executed in the same order as in a sequential run of the loop.
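A minimal sketch of the ordered variant follows. The function name visit_in_loop_order is made up, and appending to a vector stands in for the std::cout line so that the resulting order can be inspected afterwards.

```cpp
#include <cassert>
#include <vector>

// Records the iteration indices in the order the ordered region was entered.
std::vector<int> visit_in_loop_order(int n)
{
    assert(n >= 0);
    std::vector<int> visited;

    // The "ordered" clause on the loop directive is required
    // for the "ordered" region inside the loop body.
    #pragma omp parallel for ordered
    for (int i = 0; i < n; i++)
    {
        // Work placed here can still run concurrently across threads.

        #pragma omp ordered
        {
            // Entered one iteration at a time, in sequential loop order.
            visited.push_back(i);
        }
    }
    return visited;
}
```

Only the ordered block is serialized, so it is best kept small, for example just the debug output.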


 