Multi-threading run time - c++
I am working on a CPU router which finds the shortest path between a source and a target on a 2D mesh. I am using BFS expansion in combination with the 'greedy' cost algorithm to find the shortest path, and the whole function is multi-threaded. I expect the function's run time to decrease as the number of threads increases. But with up to 5 threads running, the time trend is not followed (i.e., time for 2 threads should be less than time for 5 threads). With 6 threads and up to 16 threads running, however, the trend is seen. What could be the possible reason?
I am also using a concurrent queue.
void *bfs(void* threadArg)
{
    int i, numOfElements;
    struct threadData *data;
    element* currentNode;
    data = (struct threadData *)threadArg;
    currentNode = NULL;
    for (i = 0; i < data->numElements; i++)
    {
        if (!adjacencyQ.empty())
        {
            currentNode = *(adjacencyQ.unsafe_begin());
            adjacencyQ.try_pop(*(adjacencyQ.unsafe_begin()));
        }
        //if (!adjacencyQ.empty())
        {
            if ((currentNode->north != NULL))// && (currentNode->north->visited == false))
            {
                nodeMutex[currentNode->north->i][currentNode->north->j].lock();
                if (currentNode->north->visited == false)
                {
                    adjacencyQ.push(currentNode->north);
                }
                if (currentNode->north->cost > (currentNode->cost + currentNode->northCost))
                {
                    currentNode->north->cost = currentNode->cost + currentNode->northCost;
                    currentNode->north->visited = true;
                    currentNode->north->visitedBy = currentNode;
                }
                nodeMutex[currentNode->north->i][currentNode->north->j].unlock();
            }
            if ((currentNode->south != NULL))// && (currentNode->south->visited == false))
            {
                nodeMutex[currentNode->south->i][currentNode->south->j].lock();
                if (currentNode->south->visited == false)
                {
                    adjacencyQ.push(currentNode->south);
                }
                if (currentNode->south->cost > (currentNode->cost + currentNode->southCost))
                {
                    currentNode->south->cost = currentNode->cost + currentNode->southCost;
                    currentNode->south->visited = true;
                    currentNode->south->visitedBy = currentNode;
                }
                nodeMutex[currentNode->south->i][currentNode->south->j].unlock();
            }
            if ((currentNode->east != NULL))// && (currentNode->east->visited == false))
            {
                nodeMutex[currentNode->east->i][currentNode->east->j].lock();
                if (currentNode->east->visited == false)
                {
                    adjacencyQ.push(currentNode->east);
                }
                if (currentNode->east->cost > (currentNode->cost + currentNode->eastCost))
                {
                    currentNode->east->cost = currentNode->cost + currentNode->eastCost;
                    currentNode->east->visited = true;
                    currentNode->east->visitedBy = currentNode;
                }
                nodeMutex[currentNode->east->i][currentNode->east->j].unlock();
            }
            if ((currentNode->west != NULL))// && (currentNode->west->visited == false))
            {
                nodeMutex[currentNode->west->i][currentNode->west->j].lock();
                if (currentNode->west->visited == false)
                {
                    adjacencyQ.push(currentNode->west);
                }
                if (currentNode->west->cost > (currentNode->cost + currentNode->westCost))
                {
                    currentNode->west->cost = currentNode->cost + currentNode->westCost;
                    currentNode->west->visited = true;
                    currentNode->west->visitedBy = currentNode;
                }
                nodeMutex[currentNode->west->i][currentNode->west->j].unlock();
            }
            if (!adjacencyQ.empty())
            {
                currentNode = *(adjacencyQ.unsafe_begin());
            }
        }
    }
    pthread_exit(NULL);
    return NULL;
}
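One detail in the loop above may affect both correctness and timing: the head of the queue is read with `unsafe_begin()` and removed in a separate `try_pop` call. TBB marks the `unsafe_` members as not safe to use while other threads modify the queue, so two threads can end up processing the same node, and a thread that finds the queue empty keeps its previous `currentNode` (or `NULL` on the first iteration). A minimal sketch of the single-step pop pattern follows. `SimpleQueue` and `drainConcurrently` are hypothetical names, and the mutex-backed queue is only a stand-in so the example builds without TBB; it is not the poster's code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical mutex-backed stand-in for a concurrent queue, so the sketch
// builds without TBB. Only the try_pop(T&) shape matters here.
template <typename T>
class SimpleQueue {
public:
    void push(const T& value) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push(value);
    }
    // Removes the head and stores it in `out` in one step.
    // Returns false (leaving `out` untouched) when the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::queue<T> q_;
};

// Drain `q` with `numThreads` workers and return how many times each value
// in [0, numValues) was popped. With a single-step pop, every count is 1.
std::vector<int> drainConcurrently(SimpleQueue<int>& q, int numValues,
                                   int numThreads) {
    std::vector<std::vector<int>> seen(numThreads);  // one list per worker
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&q, &seen, t] {
            int value;
            // The pop result decides whether there is work; no separate
            // empty() check and no read of the head before popping.
            while (q.try_pop(value)) seen[t].push_back(value);
        });
    }
    for (auto& w : workers) w.join();

    std::vector<int> counts(numValues, 0);
    for (const auto& s : seen)
        for (int v : s) ++counts[v];
    return counts;
}
```

In the posted loop, the equivalent change would be to declare a local `element*`, pass it to `adjacencyQ.try_pop(...)`, and skip the iteration when the call returns false.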
There are three possible reasons:
With all of them, a "poor man's profiler" would be a reasonable tool to start with (unless you have VTune installed, which does the same thing automatically): run gstack multiple times while the app is running and look at the stack traces; they may point to where the time is spent.
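To check the run-time trend itself before sampling stacks, a small harness that times the same workload at several thread counts can help. The sketch below is only an illustration: `sumSlice`, `RunResult`, and `timedRun` are hypothetical names, and the summing workload is a stand-in for the real `bfs` call, not the poster's code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real per-thread work (e.g. one bfs() call).
// It sums a slice of integers so the combined result can be checked.
static void sumSlice(const std::vector<int>& data, std::size_t begin,
                     std::size_t end, long long& out) {
    long long s = 0;  // accumulate locally, write the shared slot once
    for (std::size_t k = begin; k < end; ++k) s += data[k];
    out = s;
}

struct RunResult {
    long long total;  // combined result, used to check correctness
    double seconds;   // wall-clock time for the whole run
};

// Split `data` across `numThreads` threads (numThreads must be >= 1)
// and time the run from first thread start to last join.
RunResult timedRun(const std::vector<int>& data, unsigned numThreads) {
    std::vector<long long> partial(numThreads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = data.size() / numThreads;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = t * chunk;
        // The last thread also takes any remainder.
        const std::size_t end =
            (t + 1 == numThreads) ? data.size() : begin + chunk;
        workers.emplace_back(sumSlice, std::cref(data), begin, end,
                             std::ref(partial[t]));
    }
    for (auto& w : workers) w.join();
    const auto stop = std::chrono::steady_clock::now();

    long long total = 0;
    for (long long p : partial) total += p;
    return {total, std::chrono::duration<double>(stop - start).count()};
}
```

Calling `timedRun(data, n)` for n from 1 to 16 and repeating each measurement a few times gives a curve to compare against the expected trend. The timed region includes thread creation and joining, which can dominate when each thread has little work to do.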