I want to test #pragma omp parallel for
and #pragma omp simd
for a simple matrix addition program. When I use each of them separately, I get no error and it seems fine. But, I want to test how much performance can be gained using both of them. If I use #pragma omp parallel for
before the outer loop and #pragma omp simd
before the inner loop I get no error as well. The error occures when I use both of them before the outer loop. I get an error at runtime not compile time. ICC
and GCC
return error but Clang
doesn't. It might be because Clang
regect the parallelization. In my experiments, Clang does not parallelize and run the program with only one thread.
The program is here:
#include <stdio.h>
//#include <x86intrin.h>
#define N 512
#define M N
int __attribute__(( aligned(32))) a[N][M],
__attribute__(( aligned(32))) b[N][M],
__attribute__(( aligned(32))) c_result[N][M];
int main()
{
int i, j;
#pragma omp parallel for
#pragma omp simd
for( i=0;i<N;i++){
for(j=0;j<M;j++){
c_result[i][j]= a[i][j] + b[i][j];
}
}
return 0;
}
The error for: ICC:
IMP1.c(20): error: omp directive is not followed by a parallelizable for loop #pragma omp parallel for ^
compilation aborted for IMP1.c (code 2)
GCC:
IMP1.c: In function 'main':
IMP1.c:21:10: error: for statement expected before '#pragma' #pragma omp simd
Because in my other testes pragma omp simd
for outer loop gets better performance I need to put that there (don't I?).
Platform: Intel Core i7 6700 HQ, Fedora 27
Tested compilers: ICC 18, GCC 7.2, Clang 5
Compiler command line:
icc -O3 -qopenmp -xHOST -no-vec
gcc -O3 -fopenmp -march=native -fno-tree-vectorize -fno-tree-slp-vectorize
clang -O3 -fopenmp=libgomp -march=native -fno-vectorize -fno-slp-vectorize
From OpenMP 4.5 Specification:
2.11.4 Parallel Loop SIMD Construct
The parallel loop SIMD construct is a shortcut for specifying a parallel construct containing one loop SIMD construct and no other statement.
The syntax of the parallel loop SIMD construct is as follows:
#pragma omp parallel for simd
...
You can also write:
#pragma omp parallel
{
#pragma omp for simd
for ...
}
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.