
Multi-threading run time - C++

I am working on a CPU router that finds the shortest path between a source and a target on a 2D mesh. I am using BFS expansion in combination with a 'greedy' cost algorithm to find the shortest path, and I have implemented the whole function with multi-threading. I expect the function's run time to decrease as the number of threads increases. However, with up to 5 threads the run time does not follow this trend (i.e. the time with 5 threads should be less than the time with 2 threads, but it is not). From 6 threads up to 16 threads, the expected trend does appear. What could be the reason?

I am also using a concurrent queue.

void *bfs(void* threadArg)
{
int i, numOfElements;
struct threadData *data;
element* currentNode;

data = (struct threadData *)threadArg;
currentNode = NULL;

for (i = 0; i < data->numElements; i++)
{

    if (!adjacencyQ.empty())
    {
        currentNode = *(adjacencyQ.unsafe_begin());
        adjacencyQ.try_pop(*(adjacencyQ.unsafe_begin()));
    }

    //if (!adjacencyQ.empty())
    {
        if ((currentNode->north != NULL))// && (currentNode->north->visited == false))
        {
            nodeMutex[currentNode->north->i][currentNode->north->j].lock();
            if (currentNode->north->visited == false)
            {
                adjacencyQ.push(currentNode->north);
            }

            if (currentNode->north->cost > (currentNode->cost + currentNode->northCost))
            {
                currentNode->north->cost = currentNode->cost + currentNode->northCost;
                currentNode->north->visited = true;
                currentNode->north->visitedBy = currentNode;
            }
            nodeMutex[currentNode->north->i][currentNode->north->j].unlock();
        }

        if ((currentNode->south != NULL))// && (currentNode->south->visited == false))
        {
            nodeMutex[currentNode->south->i][currentNode->south->j].lock();
            if (currentNode->south->visited == false)
            {
                adjacencyQ.push(currentNode->south);
            }

            if (currentNode->south->cost > (currentNode->cost + currentNode->southCost))
            {
                currentNode->south->cost = currentNode->cost + currentNode->southCost;
                currentNode->south->visited = true;
                currentNode->south->visitedBy = currentNode;
            }
            nodeMutex[currentNode->south->i][currentNode->south->j].unlock();
        }

        if ((currentNode->east != NULL))// && (currentNode->east->visited == false))
        {
            nodeMutex[currentNode->east->i][currentNode->east->j].lock();
            if (currentNode->east->visited == false)
            {
                adjacencyQ.push(currentNode->east);
            }

            if (currentNode->east->cost > (currentNode->cost + currentNode->eastCost))
            {
                currentNode->east->cost = currentNode->cost + currentNode->eastCost;
                currentNode->east->visited = true;
                currentNode->east->visitedBy = currentNode;
            }
            nodeMutex[currentNode->east->i][currentNode->east->j].unlock();
        }

        if ((currentNode->west != NULL))// && (currentNode->west->visited == false))
        {
            nodeMutex[currentNode->west->i][currentNode->west->j].lock();
            if (currentNode->west->visited == false)
            {
                adjacencyQ.push(currentNode->west);
            }

            if (currentNode->west->cost > (currentNode->cost + currentNode->westCost))
            {
                currentNode->west->cost = currentNode->cost + currentNode->westCost;
                currentNode->west->visited = true;
                currentNode->west->visitedBy = currentNode;
            }
            nodeMutex[currentNode->west->i][currentNode->west->j].unlock();
        }

        if (!adjacencyQ.empty())
        {
            currentNode = *(adjacencyQ.unsafe_begin());
        }
    }
}

pthread_exit(NULL);
return NULL;

}

There are three possible reasons:

  1. Contention
  2. CPU starvation
  3. Bugs in the implementation: have the code peer-reviewed and fix whatever turns up
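As an illustration of the third point: the loop in the question reads the head of the queue through unsafe_begin() and then calls try_pop as a separate step, so two threads can end up working on the same node. The sketch below shows the usual alternative, where the pop itself hands back the element in one atomic step and its return value is checked. It assumes adjacencyQ is something like a tbb::concurrent_queue (the question does not say); SimpleConcurrentQueue and drainWithWorkers are hypothetical names used only to keep the example self-contained.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the concurrent queue used in the question.
template <typename T>
class SimpleConcurrentQueue {
public:
    void push(const T& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(value);
    }

    // Removes the front element and copies it into `out` under one lock.
    // Returns false, leaving `out` untouched, if the queue was empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) {
            return false;
        }
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<T> items_;
};

// Fills a queue with the integers 0..itemCount-1, drains it with
// `threadCount` workers, and returns how many times each item was popped.
std::vector<int> drainWithWorkers(int itemCount, int threadCount) {
    assert(itemCount >= 0 && threadCount > 0);

    SimpleConcurrentQueue<int> queue;
    for (int i = 0; i < itemCount; ++i) {
        queue.push(i);
    }

    // Each worker records what it popped in its own vector, so no extra
    // synchronization is needed for the bookkeeping.
    std::vector<std::vector<int>> popped(threadCount);
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&queue, &popped, t] {
            int item = 0;
            // The worker only touches `item` when try_pop reports success.
            while (queue.try_pop(item)) {
                popped[t].push_back(item);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }

    std::vector<int> timesSeen(itemCount, 0);
    for (const auto& list : popped) {
        for (int item : list) {
            ++timesSeen[item];
        }
    }
    return timesSeen;
}
```

Calling drainWithWorkers(1000, 4) should report every item exactly once. In the question's code, the equivalent change would be along the lines of replacing the unsafe_begin()/try_pop pair with a single try_pop(currentNode) and skipping the iteration when it returns false.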

Whichever of these it is, a "poor man's profiler" is a reasonable place to start (unless you have VTune installed, which does the same thing automatically): run gstack several times while the application is running and look at the stack traces; they may point to where the time is being spent.
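Alongside the stack traces, it may help to confirm the scaling trend itself with repeated measurements, since a single run per thread count can be noisy. The following is a minimal sketch of such a harness, not the asker's code: timeWithThreads, scalingProfile and isOversubscribed are hypothetical helper names, and the work callback stands in for whatever each thread does in the real router.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Runs work(threadIndex, threadCount) on `threadCount` threads and returns
// the wall-clock time in seconds from first launch to last join.
double timeWithThreads(int threadCount,
                       const std::function<void(int, int)>& work) {
    assert(threadCount > 0);
    const auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back(work, t, threadCount);
    }
    for (auto& thread : threads) {
        thread.join();
    }

    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// For each thread count from 1 to maxThreads, repeats the measurement
// `repeats` times and keeps the fastest run to reduce noise.
// Element n-1 of the result is the best time observed with n threads.
std::vector<double> scalingProfile(int maxThreads, int repeats,
                                   const std::function<void(int, int)>& work) {
    assert(maxThreads > 0 && repeats > 0);
    std::vector<double> best;
    for (int n = 1; n <= maxThreads; ++n) {
        double fastest = timeWithThreads(n, work);
        for (int r = 1; r < repeats; ++r) {
            fastest = std::min(fastest, timeWithThreads(n, work));
        }
        best.push_back(fastest);
    }
    return best;
}

// True when more threads are requested than the machine reports it can run
// at once. hardware_concurrency() may return 0 when the value is unknown;
// that case is treated here as "cannot tell", so it returns false.
bool isOversubscribed(int threadCount) {
    const unsigned hardwareThreads = std::thread::hardware_concurrency();
    return hardwareThreads != 0 &&
           static_cast<unsigned>(threadCount) > hardwareThreads;
}
```

If the best-of-several times still fail to drop between 2 and 5 threads, contention or a bug becomes the more likely explanation; if isOversubscribed returns true for the thread counts being tested, CPU starvation is worth checking first.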
