
Matrix multiplication with OpenMP is slower than sequential

I am new to C and have written a program that creates two arrays and then multiplies them using OpenMP. When I compare the two, the sequential version is quicker than the OpenMP one.

#include <stdio.h>
#include <omp.h>
#include <time.h>
#define SIZE 1000

int arrayOne[SIZE][SIZE];
int arrayTwo[SIZE][SIZE];
int arrayThree[SIZE][SIZE];

int main()
{
  int i=0, j=0, k=0, sum = 0;

  //creation of the first array
  for(i = 0; i < SIZE; i++){
    for(j = 0; j < SIZE; j++){
      arrayOne[i][j] = 2;
      /*printf("%d \t", arrayOne[i][j]);*/
    }
  }

  //creation of the second array
  for(i = 0; i < SIZE; i++){
    for(j = 0; j < SIZE; j++){
      arrayTwo[i][j] = 3;
      /*printf("%d \t", arrayTwo[i][j]);*/
    }
  }

  clock_t begin = clock();
  //Matrix Multiplication (No use of openMP)
  for (i = 0; i < SIZE; ++i) {
      for (j = 0; j < SIZE; ++j) {
          for (k = 0; k < SIZE; ++k) {
              sum = sum + arrayOne[i][k] * arrayTwo[k][j];
          }
          arrayThree[i][j] = sum;
          sum = 0;
      }
  }
  clock_t end = clock();
  double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
  printf("Time taken without openMp: %f  \n", time_spent);

  //Matrix Multiplication Using openMP
  printf("---------------------\n");
  clock_t new_begin = clock();
  #pragma omp parallel private(i, j, sum, k) shared(arrayOne, arrayTwo, arrayThree)
  {
    #pragma omp for schedule(static)
    for (i = 0; i < SIZE; i++) {
      for (j = 0; j < SIZE; j++) {
        sum = 0;  /* private copies start uninitialized, so reset before accumulating */
        for (k = 0; k < SIZE; k++) {
          sum = sum + arrayOne[i][k] * arrayTwo[k][j];
        }
        arrayThree[i][j] = sum;
      }
    }
  }
  clock_t new_end = clock();
  double new_time_spent = (double)(new_end - new_begin) / CLOCKS_PER_SEC;
  printf("Time taken WITH openMp: %f  ", new_time_spent);

  return 0;
}

The sequential version takes 0.265000 while the OpenMP version takes 0.563000. I have no idea why; any solutions?

Update: I changed the code to use global arrays and made them larger, but the OpenMP version still takes double the run time.

OpenMP needs to create and destroy threads, which results in extra overhead. Such overhead is quite small, but for a very small workload, like yours, it can still be significant.

To use OpenMP efficiently, give it a large enough workload (larger matrices) that thread-related overhead is not the dominant factor in your runtime.
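For illustration, here is a minimal sketch of the same loop on a larger workload. The 2000x2000 size, the array names a/b/c, and the use of omp_get_wtime() for timing are my own choices for the example, not taken from your code:

#include <stdio.h>
#include <omp.h>

#define SIZE 2000   /* larger matrices so per-thread work dominates thread start-up cost */

int a[SIZE][SIZE], b[SIZE][SIZE], c[SIZE][SIZE];

int main(void)
{
  int i, j, k;

  /* fill the inputs */
  for (i = 0; i < SIZE; i++) {
    for (j = 0; j < SIZE; j++) {
      a[i][j] = 2;
      b[i][j] = 3;
    }
  }

  /* omp_get_wtime() is OpenMP's wall-clock timer, in seconds */
  double start = omp_get_wtime();

  #pragma omp parallel for private(j, k) schedule(static)
  for (i = 0; i < SIZE; i++) {
    for (j = 0; j < SIZE; j++) {
      int sum = 0;  /* local to each (i, j) element, so no sharing issues */
      for (k = 0; k < SIZE; k++) {
        sum += a[i][k] * b[k][j];
      }
      c[i][j] = sum;
    }
  }

  double end = omp_get_wtime();
  printf("Parallel time: %f s\n", end - start);
  return 0;
}

Compiled with something like gcc -O2 -fopenmp, the parallel loop should start to win once the matrices are big enough that each thread's share of the work outweighs the cost of spinning up the threads.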
