
How does execution time depend on the number of threads when using the OpenMP library?

Increasing the number of threads increases the execution time of the loop rather than decreasing it.

#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#include <limits.h>
#define n 4

int main(int argc, char **argv)
{
    FILE * file1 = fopen("output.txt", "w");
    if (file1 == NULL){
        exit(EXIT_FAILURE);
    }

    srand(time(NULL));
    int matrix[n][n];
    int i, j;
    for(i = 0; i < n; i++){
        for (j = 0; j < n; j++){
            matrix[i][j] = rand() % 100 + 1;
            fprintf(file1, "%d ", matrix[i][j]);
        }
        fprintf(file1, "\n");
    }
    int sum = 0;
    int min;
    double start;
    double end;

Start timing the loop:

    start = omp_get_wtime();

    // I varied the value passed to num_threads
    // to investigate why the time increases

#pragma omp parallel for private(i, j, min) reduction(+:sum) num_threads(4)
    for (i = 0; i < n; i++) {
        min = INT_MAX;
        for (j = 0; j < n; j++) {
            if (matrix[j][i] < min) {
                min = matrix[j][i];
            }
        }
        sum += min; // sum of the minimum of each column
    }
    end = omp_get_wtime();

    printf("Time: %lf\n", end - start);

    printf("Min sum of matrix = %d", sum);
    fclose(file1);
    return 0;
}

4 threads
Time: 0.000930
3 threads
Time: 0.000356
2 threads
Time: 0.000533
1 thread
Time: 0.000008

My CPU has 4 threads.

You have a very small problem (4x4) and you are timing thread creation. I don't expect parallelism to help much at this scale anyway, since just the cost of waking the threads and then synchronizing them again at the end of the parallel region will be far larger than the work you are trying to do. You can, however, remove the cost of creating the thread pool from your measurement by adding

#pragma omp parallel
;

before the timed region.
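As a rough sketch of that suggestion, the fragment below runs an empty parallel region first and only then starts the clock around the loop. The names `now_seconds` and `timed_column_min_sum` and the constant `N` are made up for illustration and are not from the question; the `#ifdef _OPENMP` fallback is an added assumption so the sketch still builds without OpenMP. Many runtimes keep their threads alive between regions, but that is typical behavior rather than a guarantee.

```c
#include <limits.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
#endif

#define N 4  /* matrix size, as in the question */

/* Timestamp in seconds: omp_get_wtime() with OpenMP,
   CPU time from clock() as a fallback without it. */
double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: times only the loop, after a warm-up region.
   Stores the sum of column minima in *sum_out and returns the
   elapsed seconds. */
double timed_column_min_sum(int m[N][N], int *sum_out)
{
    int sum = 0;

    /* Warm-up: an empty parallel region, so that thread start-up is
       (typically) paid here rather than inside the timed span. */
    #pragma omp parallel
    { }

    double start = now_seconds();
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        int min = INT_MAX;
        for (int j = 0; j < N; j++) {
            if (m[j][i] < min) {
                min = m[j][i];
            }
        }
        sum += min;  /* add this column's minimum */
    }
    double end = now_seconds();

    *sum_out = sum;
    return end - start;
}
```

Called from `main` as `timed_column_min_sum(matrix, &sum)`, this would report the time for the loop alone. With a 4x4 matrix the wake-up and synchronization cost will most likely still dominate, so a much larger `N` is needed before extra threads can pay off.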

And, please, please, please, don't force the number of threads. Use the OMP_NUM_THREADS environment variable.
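One way to follow that advice, sketched under the same assumptions as the question's loop, is to drop the `num_threads(...)` clause and let the runtime decide. The function names `available_threads`, `column_min_sum`, and `report` are hypothetical; `omp_get_max_threads()` is the standard OpenMP call that reports how many threads the next parallel region may use.

```c
#include <limits.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define N 4  /* matrix size, as in the question */

/* Threads the runtime would use for the next parallel region;
   1 when the file is compiled without OpenMP support. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Sum of the minimum of each column. There is no num_threads clause,
   so the thread count comes from the runtime (for example from the
   OMP_NUM_THREADS environment variable). */
int column_min_sum(int m[N][N])
{
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        int min = INT_MAX;
        for (int j = 0; j < N; j++) {
            if (m[j][i] < min) {
                min = m[j][i];
            }
        }
        sum += min;
    }
    return sum;
}

/* Print the thread count in effect together with the result. */
void report(int m[N][N])
{
    printf("threads available: %d\n", available_threads());
    printf("sum of column minima: %d\n", column_min_sum(m));
}
```

Built with OpenMP enabled (for instance `gcc -fopenmp`), setting `OMP_NUM_THREADS=2` in the environment before running should make `available_threads()` report 2, so different thread counts can be compared without recompiling.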
