
Nested for loop in OpenMP program taking too long

I'm having a problem parallelizing my program with OpenMP. The first for loop takes about 10 milliseconds, but the second takes about 45 seconds. I'm not sure whether I'm doing something wrong in the second loop that wastes time.

float A[M][M];
float B[M][M];
float C[M][M];

main(int argc, char** argv) {
float temp;
float real;
float error = 0;
int i,j,k;
double time_start;
double time_end;
double time_mid;
int n  = 12;

omp_set_num_threads(n);
time_start = omp_get_wtime();


#pragma omp parallel default(shared) private(i,j,k,temp,real) reduction(+:error)
#pragma omp for
for (i=0; i<M; i++) {
        for (j=0; j<M; j++) {
                A[i][j] = ((i+1)*(j+1))/(float)M;
                B[i][j] = (j+1)/(float)(i+1);
        }
}

time_mid = omp_get_wtime();
#pragma omp for
for (i=0; i<M; i++) {
        for (j=0; j<M; j++) {
                temp = 0;
                for (k=0; k<M; k++) {
                        temp += A[i][k]*B[k][j];
                }
                C[i][j] = temp;
                real =(float) (i+1)*(j+1);
                error = error + (float) fabs(temp-real)/real;
        }
}


time_end = omp_get_wtime();
error = (100/(float)(M*M))*error;

printf("Percent error for C[][] is: %f\n", error);
printf("Time is: %f\n%f\n%f\n%f\n", time_end-time_start, time_start, time_mid, time_end);

return 0;
}

From the OpenMP specification (page 35, section 2.1, Directive Format, C/C++):
https://www.openmp.org/wp-content/uploads/openmp-4.5.pdf

An OpenMP executable directive applies to at most one succeeding statement, which must be a structured block.

In C++, a block is defined in stmt.block (a compound statement, i.e. statements enclosed in braces).

Therefore #pragma omp parallel default(shared) private(i,j,k,temp,real) reduction(+:error) applies only to the statement that immediately follows it, which is your first for loop.

The second loop is not inside a #pragma omp parallel region. Its #pragma omp for is orphaned, so the whole triple loop runs on a single thread, which is why it takes so long.
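
To see this in isolation, here is a minimal sketch. The function name orphaned_team_size is made up for illustration, and the stub for omp_get_num_threads is only there so the code still builds when OpenMP is disabled. It places a #pragma omp for outside any parallel region and reports how many threads the loop was shared among.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback stub (assumption): lets the sketch build without OpenMP. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: returns the size of the thread team that runs a
   worksharing loop placed outside any "#pragma omp parallel" region. */
int orphaned_team_size(void) {
    int team = 0;

    /* No enclosing parallel region: the loop is not split across threads. */
    #pragma omp for
    for (int i = 0; i < 4; i++) {
        if (i == 0) {
            team = omp_get_num_threads();
        }
    }
    return team;
}
```

Called from ordinary serial code, this should report a team of one thread regardless of omp_set_num_threads, which matches the behaviour of the second loop in the question.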

Use #pragma omp parallel { ... } with braces so that the parallel region encloses the second loop as well.
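
A variation on that fix, sketched below under some assumptions, gives each loop its own region with the combined #pragma omp parallel for construct rather than one braced region. M is set to 256 only for illustration because the question does not show its value, the function name matmul_percent_error is hypothetical, and the absolute value is computed by hand instead of with fabs so the sketch needs no math library. Loop indices and temporaries are declared inside the loops, which makes them private without a private clause.

```c
#define M 256  /* assumed size; the question does not show how M is defined */

static float A[M][M], B[M][M], C[M][M];

/* Hypothetical helper: fills A and B, computes C = A*B, and returns the
   percent error of C against the exact product (i+1)*(j+1). */
float matmul_percent_error(void) {
    float error = 0.0f;

    /* First loop: its own parallel region via the combined construct. */
    #pragma omp parallel for
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < M; j++) {
            A[i][j] = ((i + 1) * (j + 1)) / (float)M;
            B[i][j] = (j + 1) / (float)(i + 1);
        }
    }

    /* Second loop: also inside a parallel region, with the reduction
       attached to the loop that actually accumulates error. */
    #pragma omp parallel for reduction(+:error)
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < M; j++) {
            float temp = 0.0f;
            for (int k = 0; k < M; k++) {
                temp += A[i][k] * B[k][j];
            }
            C[i][j] = temp;
            float real = (float)(i + 1) * (j + 1);
            float diff = temp - real;
            error += (diff < 0 ? -diff : diff) / real;
        }
    }
    return (100 / (float)(M * M)) * error;
}
```

With GCC this would be built with -fopenmp. Taking the omp_get_wtime() readings in serial code before and after each region, rather than inside a parallel region, avoids every thread writing the same timing variable.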
