简体   繁体   中英

OpenMP sections run slower than single thread

Why is the parallel application taking more time to execute than the one with the single thread? I am using an 8 CPU computer with Ubuntu 14.04. The code is just my simple way to test omp parallel sections, the aim later is to run two different functions in two different threads at the same time, so I do not want to use #pragma omp parallel for.

parallel:

int main()
{
int k = 0;
int m = 0;

omp_set_num_threads(2);

#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        {
            for( k = 0; k < 1e9; k++ ){};
        }

        #pragma omp section
        {
            for( m = 0; m < 1e9; m++ ){};
        }
    }
}

return 0;

}

and the single thread:

int main()
{
int m = 0;
int k = 0;

for( k = 0; k < 1e9; k++ ){};


for( m = 0; m < 1e9; m++ ){};

return 0;

}

If the compiler would not optimise the loops, then the parallel code would suffer from false sharing because m and k are very likely to end up in the same cache line. Make the variables private :

#pragma omp parallel private(k,m)
{
    #pragma omp sections
    {
        #pragma omp section
        {
            for( k = 0; k < 1e9; k++ ){};
        }

        #pragma omp section
        {
            for( m = 0; m < 1e9; m++ ){};
        }
    }
}

At high optimisation levels, the compiler could drop the loops altogether. But then the parallel version will still have the added overhead of spawning the OpenMP worker threads and joining them afterwards, which will make it slower than the sequential version.

In above test code compiler itself optimizing the code. You need to change your test code. Depending on number of thread you are creating also add an overhead. Also refer, Amdahl's Law.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM