Parallelize Algorithm with OpenMP in C++

My problem is this:

I want to solve TSP with the Ant Colony Optimization Algorithm in C++. Right now I've implemented an algorithm that solves this problem iteratively.

For example: I generate 500 ants, and they find their routes one after the other. Each ant does not start until the previous ant has finished.

Now I want to parallelize the whole thing, and I thought about using OpenMP.

So my first question is: Can I generate a large number of threads that work simultaneously (for a number of ants > 500)?

I already tried something out. This is the code from my main.cpp:

 #pragma omp parallel for       
    for (auto ant = antarmy.begin(); ant != antarmy.end(); ++ant) {
        #pragma omp ordered
        if (ant->getIterations() < ITERATIONSMAX) {
            ant->setNumber(currentAntNumber);
            currentAntNumber++;
            ant->antRoute();
        }

    }

And this is the code in my Ant class that is "critical", because each Ant reads from and writes to the same matrix (the pheromone matrix):

 void Ant::antRoute()
 {
     this->route.setCity(0, this->getStartIndex());
     int nextCity = this->getNextCity(this->getStartIndex());
     this->routedistance += this->data->distanceMatrix[this->getStartIndex()][nextCity];
     int tempCity;
     int i = 2;
     this->setProbability(nextCity);
     this->setVisited(nextCity);
     this->route.setCity(1, nextCity);
     updatePheromone(this->getStartIndex(), nextCity, routedistance, 0);

     while (this->getVisitedCount() < datacitycount) {
         tempCity = nextCity;
         nextCity = this->getNextCity(nextCity);
         this->setProbability(nextCity);
         this->setVisited(nextCity);
         this->route.setCity(i, nextCity);
         this->routedistance += this->data->distanceMatrix[tempCity][nextCity];
         updatePheromone(tempCity, nextCity, routedistance, 0);
         i++;
     }

     this->routedistance += this->data->distanceMatrix[nextCity][this->getStartIndex()];
     // updatePheromone(-1, -1, -1, 1);
     ShortestDistance(this->routedistance);
     this->iterationsshortestpath++;
}

void Ant::updatePheromone(int i, int j, double distance, bool reduce)
{

     #pragma omp critical(pheromone) 

     if (reduce == 1) {
        for (int x = 0; x < datacitycount; x++) {
             for (int y = 0; y < datacitycount; y++) {
                 if (REDUCE * this->data->pheromoneMatrix[x][y] < 0)
                     this->data->pheromoneMatrix[x][y] = 0.0;
                 else
                    this->data->pheromoneMatrix[x][y] -= REDUCE * this->data->pheromoneMatrix[x][y];
             }
         }
     }
     else {

         double currentpheromone = this->data->pheromoneMatrix[i][j];
         double updatedpheromone = (1 - PHEROMONEREDUCTION)*currentpheromone + (PHEROMONEDEPOSIT / distance);

         if (updatedpheromone < 0.0) {
            this->data->pheromoneMatrix[i][j] = 0;
            this->data->pheromoneMatrix[j][i] = 0;
         }
          else {
             this->data->pheromoneMatrix[i][j] = updatedpheromone;
             this->data->pheromoneMatrix[j][i] = updatedpheromone;
         }
     }

 }

For some reason, the omp parallel for loop won't work on these range-based loops. So this is my second question: if you have any suggestions on how to get the range-based loops working, I'd be happy to hear them.

Thanks for your help.

So my first question is: Can I generate a large number of threads that work simultaneously (for a number of ants > 500)?

In OpenMP you typically shouldn't care how many threads are active. Instead, you make sure to expose enough parallel work through work-sharing constructs such as omp for or omp task. So while you may have a loop with 500 iterations, your program could be run with anything between one thread and 500 (or more, but the extra threads would just idle). This differs from other parallelization approaches such as pthreads, where you have to manage all the threads and what they do.
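As a minimal sketch of exposing the 500 ants as parallel work, the loop below uses a plain integer index instead of iterators. The `Ant` struct and `runAnts` function are hypothetical stand-ins, not the asker's real classes, and the placeholder `antRoute` touches only its own ant. As far as I know, OpenMP 3.0 and later also accept random-access iterators as loop variables, while implementations limited to OpenMP 2.0 require a signed integer index, which may be why the iterator loop in the question is rejected.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the Ant class from the question.
struct Ant {
    int iterations = 0;
    double routedistance = 0.0;
    // Placeholder work: a real ant would build a complete tour here.
    void antRoute() { routedistance += 1.0; ++iterations; }
};

// Runs every ant once. OpenMP distributes the iterations over however many
// threads the runtime provides; the code never asks for a thread count.
void runAnts(std::vector<Ant>& antarmy, int iterationsMax) {
    assert(iterationsMax >= 0);
    const int n = static_cast<int>(antarmy.size());
    #pragma omp parallel for schedule(dynamic)
    for (int k = 0; k < n; ++k) {
        if (antarmy[k].iterations < iterationsMax) {
            antarmy[k].antRoute();  // only ant k is modified, so no locking is needed
        }
    }
}
```

The same binary can then be run with, for example, `OMP_NUM_THREADS=1` or `OMP_NUM_THREADS=8` without changing the code.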

Now, your example uses ordered incorrectly. Ordered is only useful if you have a small part of your loop body that needs to be executed in order. Even then it can be very problematic for performance. Also, you need to declare the loop itself as ordered if you want to use ordered inside it. See also this excellent answer.
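For illustration only, this is roughly what a syntactically correct use of ordered looks like: the loop directive carries the `ordered` clause, and only a small block inside the body is marked `omp ordered`. The function `squaresInOrder` is a made-up example and is not part of the ant code.

```cpp
#include <cassert>
#include <vector>

// Computes k*k in parallel but appends the results in iteration order.
std::vector<int> squaresInOrder(int n) {
    assert(n >= 0);
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(n));
    // The loop must be declared "ordered" before an ordered region may appear inside it.
    #pragma omp parallel for ordered schedule(static, 1)
    for (int k = 0; k < n; ++k) {
        int sq = k * k;        // this part may run concurrently on several threads
        #pragma omp ordered
        {
            out.push_back(sq); // this part runs one iteration at a time, in loop order
        }
    }
    return out;
}
```

If nearly the whole loop body sits inside the ordered region, as in the question, the iterations effectively run one after another and the parallel loop gains little.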

You should not use ordered. Instead, make sure that the ants know their number beforehand, write the code such that they don't need a number, or at the very least make sure that the order of the numbers doesn't matter for the ants. In the latter case you can use omp atomic capture.
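A hedged sketch of the atomic capture variant: each iteration takes the next value of a shared counter, so every ant gets a unique number, but which ant gets which number depends on thread timing. The function name `assignAntNumbers` is hypothetical, and the construct assumes a compiler with OpenMP 3.1 or later.

```cpp
#include <cassert>
#include <vector>

// Returns one unique number per ant, handed out in no particular order.
std::vector<int> assignAntNumbers(int antCount) {
    assert(antCount >= 0);
    std::vector<int> numbers(static_cast<std::size_t>(antCount), -1);
    int currentAntNumber = 0;  // shared counter, as in the question
    #pragma omp parallel for
    for (int k = 0; k < antCount; ++k) {
        int myNumber;
        // Read the old value and increment the counter as one atomic step.
        #pragma omp atomic capture
        myNumber = currentAntNumber++;
        numbers[k] = myNumber;  // each iteration writes its own slot
    }
    return numbers;
}
```

If the number only needs to be unique, using the loop index `k` itself is simpler still and needs no synchronization; that corresponds to the "know their number beforehand" option.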

As to the access to shared data: try to avoid it as much as possible. Adding omp critical is a first step towards a correct parallel program, but it often leads to performance problems. Measure your parallel efficiency and use parallel performance analysis tools to find out whether this is the case for you. Then you can use atomic data access or a reduction (each thread has its own data to work on, and only after the main work is finished is the data from all threads merged).
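The reduction idea can be sketched by hand as follows. Each thread accumulates pheromone deposits into a private buffer, and the buffers are merged into the shared matrix once per thread rather than once per edge. This is a simplification under stated assumptions: the `Deposit` struct and `mergeDeposits` function are hypothetical, the matrix is stored as a flat row-major vector, only the additive deposit is modelled (not the evaporation term from the question), and ants in the same batch no longer see each other's deposits, which changes the algorithm's behaviour slightly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one pheromone deposit on the edge (i, j), with i != j.
struct Deposit { int i; int j; double amount; };

// Sums all deposits into a symmetric cityCount x cityCount matrix (row-major).
std::vector<double> mergeDeposits(int cityCount, const std::vector<Deposit>& deposits) {
    assert(cityCount > 0);
    std::vector<double> pheromone(static_cast<std::size_t>(cityCount) * cityCount, 0.0);
    const int n = static_cast<int>(deposits.size());
    #pragma omp parallel
    {
        // Private per-thread buffer: no synchronization while accumulating.
        std::vector<double> local(pheromone.size(), 0.0);
        #pragma omp for nowait
        for (int k = 0; k < n; ++k) {
            const Deposit& d = deposits[k];
            assert(d.i >= 0 && d.i < cityCount && d.j >= 0 && d.j < cityCount);
            local[static_cast<std::size_t>(d.i) * cityCount + d.j] += d.amount;
            local[static_cast<std::size_t>(d.j) * cityCount + d.i] += d.amount;
        }
        // Merge step: entered once per thread instead of once per deposit.
        #pragma omp critical(pheromone_merge)
        {
            for (std::size_t c = 0; c < pheromone.size(); ++c) {
                pheromone[c] += local[c];
            }
        }
    }
    return pheromone;
}
```

Whether this pays off depends on the matrix size and the number of threads, so it is worth measuring against the omp critical version before committing to it.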
