Trying to process linked list data in parallel with OpenMP

I am trying to process linked list data in parallel with OpenMP in C++. I'm pretty new to OpenMP and pretty rusty with C++. What I want to do is get several threads to break up the linked list, and output the data of the Nodes in their particular range. I don't care about the order in which the output occurs. If I can get this working, I want to replace the simple output with some actual processing of the Node data.

I've found several things on the internet (including a few questions on this site) and from what I found, I cobbled together a code like this:

    #include <iostream>
    #include <omp.h>

    // various and sundry other stuff ...

    struct Node {
        int data;
        Node* next;
    };

    int main() {

        struct Node *newHead;
        struct Node *head = new Node;
        struct Node *currNode;
        int n;
        int tid;

        //create a bunch of Nodes in linked list with "data" ...

        // traverse the linked list:
        // examine data
        #pragma omp parallel private(tid)
        {
            currNode = head;
            tid=omp_get_thread_num();
            #pragma omp single
            {
                while (currNode) {
                    #pragma omp task firstprivate(currNode)
                    {
                        cout << "Node data: " << currNode->data << " " << tid << "\n";
                    } // end of pragma omp task
                    currNode = currNode->next;
                } // end of while
            } //end of pragma omp single

        }  // end of pragma omp parallel

        // clean up etc. ...

    }  // end of main

So I run:

>: export OMP_NUM_THREADS=6
>: g++ -fopenmp ll_code.cpp
>: ./a.out

And the output is:

Node data: 5 0
Node data: 10 0
Node data: 20 0
Node data: 30 0
Node data: 35 0
Node data: 40 0
Node data: 45 0
Node data: 50 0
Node data: 55 0
Node data: 60 0
Node data: 65 0
Node data: 70 0
Node data: 75 0

So, tid is always 0. And that means, unless I'm really misunderstanding something, only one thread did anything with the linked list, and so the linked list was not traversed in parallel at all.

When I get rid of single, the code fails with a seg fault. I have tried moving a few variables in and out of the OpenMP directive scopes, with no change. Changing the number of threads has no effect. How can this be made to work?

A secondary question: Some sites say the firstprivate(currNode) is necessary and others say currNode is firstprivate by default. Who is right?

You certainly can traverse a linked list using multiple threads, but it will actually be slower than just using a single thread.

The reason is that to know the address of node N (for N != 0), you must first know the address of node N-1.

Assume now that you have N threads, each responsible for starting at position i. The above paragraph implies that thread i will depend on the result of thread i-1, which in turn will depend on the result of thread i-2, and so on.

What you end up with is a serial traversal anyway. But now, instead of just a simple for loop, you have to synchronize threads too, making things inherently slower.
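To make the dependency concrete, here is a minimal sketch (the helper name nth_node and its hops counter are illustrative, not part of the question's code). It shows that reaching position i always costs i pointer dereferences starting from the head, no matter which thread does the walking.

```cpp
#include <cstddef>

struct Node {
    int data;
    Node* next;
};

// Hypothetical helper: walks from head to the node at position index
// (0 = head). Returns nullptr if the list is shorter than that.
// hops receives the number of next-pointer dereferences the walk needed.
Node* nth_node(Node* head, std::size_t index, std::size_t& hops) {
    hops = 0;
    Node* p = head;
    while (p && hops < index) {
        p = p->next;  // the only way to learn the next address
        ++hops;
    }
    return p;
}
```

A thread assigned to start at position i pays for those i hops before it can touch its first node.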

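But, if you're trying to do some heavy processing that would benefit from being run in parallel, then yes, you're going for the right approach. Your tasks may well be running on different threads already: tid is private in the parallel region, so each task receives a copy of the value held by the thread that created it (the one executing the single block), and that is the value it prints. You can just change how you're getting the thread id: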

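#include <iostream>
#include <omp.h>

struct Node {
    int data;
    Node* next;
};

int main() {

    // Build a list holding 0..9. Every next pointer is initialized,
    // so the last node reliably terminates the traversal.
    Node *head = new Node{0, nullptr};
    Node *currNode = head;
    for (int i = 1; i < 10; ++i) {
        currNode->next = new Node{i, nullptr};
        currNode = currNode->next;
    }

    // traverse the linked list:
    // examine data
    #pragma omp parallel
    {
        // One thread walks the list and creates a task per node;
        // the other threads in the team pick the tasks up.
        #pragma omp single
        {
            // Reset inside single so that only the walking thread writes currNode.
            currNode = head;
            while (currNode) {
                // currNode is shared here, so each task needs its own copy.
                #pragma omp task firstprivate(currNode)
                {
                    // Query the id inside the task: this is the executing thread.
                    #pragma omp critical (cout)
                    std::cout << "Node data: " << currNode->data << " " << omp_get_thread_num() << "\n";
                }
                currNode = currNode->next;
            }
        }
    }

    // clean up
    while (head) {
        Node *next = head->next;
        delete head;
        head = next;
    }
}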

Possible output:

Node data: 0 4
Node data: 6 4
Node data: 7 4
Node data: 8 4
Node data: 9 4
Node data: 1 3
Node data: 2 5
Node data: 3 2
Node data: 4 1
Node data: 5 0

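The difference between reading the thread id when a task is created and reading it when the task runs can be checked with a small sketch. The TaskTrace struct and trace_tasks function below are hypothetical names introduced for illustration, and the serial fallback for omp_get_thread_num is only there so the sketch also builds without -fopenmp.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallback so the sketch still builds without -fopenmp.
static int omp_get_thread_num() { return 0; }
#endif

struct Node {
    int data;
    Node* next;
};

// Hypothetical record: the node's data, which thread created the
// node's task, and which thread actually ran it.
struct TaskTrace {
    int data;
    int creator;
    int executor;
};

std::vector<TaskTrace> trace_tasks(Node* head) {
    // Size the result up front so tasks never trigger a reallocation.
    std::size_t count = 0;
    for (Node* p = head; p; p = p->next) ++count;
    std::vector<TaskTrace> trace(count);

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();  // one value per thread
        #pragma omp single
        {
            std::size_t i = 0;
            for (Node* p = head; p; p = p->next, ++i) {
                // tid, p and i are copied into the task when it is created.
                #pragma omp task firstprivate(tid, p, i) shared(trace)
                {
                    trace[i].data = p->data;
                    trace[i].creator = tid;                    // creating thread
                    trace[i].executor = omp_get_thread_num();  // running thread
                }
            }
        }
    }  // all tasks have finished by the end of the parallel region
    return trace;
}
```

Every entry should report the same creator, while executor may differ from entry to entry. This also bears on the secondary question: a variable that is private in the enclosing region (tid above) is firstprivate in a task by default, whereas currNode in the question is shared in the parallel region and would stay shared in the task, so the explicit firstprivate(currNode) is needed there.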

Finally, for a more idiomatic approach, consider using a std::forward_list:

#include <forward_list>
#include <iostream>
#include <omp.h>

int main() {

    std::forward_list<int> list;
    for (int i=0;i<10;++i) list.push_front(i);

    #pragma omp parallel
    #pragma omp single
    for(auto data : list) {
       #pragma omp task firstprivate(data)
       #pragma omp critical (cout)
       std::cout << "Node data: " << data << " " << omp_get_thread_num() << "\n";
    }
}
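Since the pointer chasing is serial either way, another option (an alternative to the task-based code above, not something the original answer proposes) is to do the walk once, store the node pointers in a vector, and then let a parallel for split the actual processing. In this sketch, process is a hypothetical stand-in for whatever real per-node work replaces the printing.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int data;
    Node* next;
};

// Hypothetical stand-in for the "actual processing" of one node.
long process(const Node& n) {
    return static_cast<long>(n.data) * n.data;
}

long process_list(Node* head) {
    // Serial pass: the only part that has to follow next pointers.
    std::vector<Node*> nodes;
    for (Node* p = head; p; p = p->next) nodes.push_back(p);

    long total = 0;
    const long n = static_cast<long>(nodes.size());

    // Parallel pass: random access lets OpenMP divide the index range.
    // The reduction gives each thread a private partial sum.
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += process(*nodes[i]);
    }
    return total;
}
```

Without -fopenmp the pragma is ignored and the function runs serially with the same result. Whether this beats tasks depends on how expensive the per-node work is compared with the extra pass and the vector allocation.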
