
Matrix multiplication with OpenMP is slower than sequential

I am new to C and have written a program that creates two arrays and then multiplies them using OpenMP. When I compare the two, the sequential version is quicker than the OpenMP one.

#include <stdio.h>
#include <omp.h>
#include <time.h>
#define SIZE 1000

int arrayOne[SIZE][SIZE];
int arrayTwo[SIZE][SIZE];
int arrayThree[SIZE][SIZE];

int main()
{
  int i=0, j=0, k=0, sum = 0;

  //creation of the first array
  for(i = 0; i < SIZE; i++){
    for(j = 0; j < SIZE; j++){
      arrayOne[i][j] = 2;
      /*printf("%d \t", arrayOne[i][j]);*/
    }
  }

  //creation of the second array
  for(i = 0; i < SIZE; i++){
    for(j = 0; j < SIZE; j++){
      arrayTwo[i][j] = 3;
      /*printf("%d \t", arrayTwo[i][j]);*/
    }
  }

  clock_t begin = clock();
  //Matrix Multiplication (No use of openMP)
  for (i = 0; i < SIZE; ++i) {
      for (j = 0; j < SIZE; ++j) {
          for (k = 0; k < SIZE; ++k) {
              sum = sum + arrayOne[i][k] * arrayTwo[k][j];
          }
          arrayThree[i][j] = sum;
          sum = 0;
      }
  }
  clock_t end = clock();
  double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
  printf("Time taken without openMp: %f  \n", time_spent);

  //Matrix Multiplication Using openMP
  printf("---------------------\n");
  clock_t new_begin = clock();
  #pragma omp parallel private(i, j, sum, k) shared(arrayOne, arrayTwo, arrayThree)
  {
    #pragma omp for schedule(static)
    for (i = 0; i < SIZE; i++) {
      for (j = 0; j < SIZE; j++) {
        sum = 0;  /* private copies start uninitialized, so reset before accumulating */
        for (k = 0; k < SIZE; k++) {
          sum = sum + arrayOne[i][k] * arrayTwo[k][j];
        }
        arrayThree[i][j] = sum;
      }
    }
  }
  clock_t new_end = clock();
  double new_time_spent = (double)(new_end - new_begin) / CLOCKS_PER_SEC;
  printf("Time taken WITH openMp: %f  ", new_time_spent);

  return 0;
}

The sequential version takes 0.265000 while the OpenMP version takes 0.563000. I have no idea why; any solutions?

Update: I changed the code to use global arrays and made them larger, but the OpenMP version still takes double the run time.

OpenMP needs to create and destroy threads, which results in extra overhead. Such overhead is quite small, but for a very small workload, like yours, it can still be significant.

To use OpenMP efficiently, give it a large enough workload (larger matrices) that thread-related overhead is not the dominant factor in your runtime.
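For illustration, here is a minimal sketch of the same loop on a larger workload. The 2000x2000 size, the array names a/b/c, and the use of omp_get_wtime() for timing are my own choices for the example, not taken from your code:

#include <stdio.h>
#include <omp.h>

#define SIZE 2000   /* larger matrices so per-thread work dominates thread start-up cost */

int a[SIZE][SIZE], b[SIZE][SIZE], c[SIZE][SIZE];

int main(void)
{
  int i, j, k;

  /* fill the inputs */
  for (i = 0; i < SIZE; i++) {
    for (j = 0; j < SIZE; j++) {
      a[i][j] = 2;
      b[i][j] = 3;
    }
  }

  /* omp_get_wtime() is OpenMP's wall-clock timer, in seconds */
  double start = omp_get_wtime();

  #pragma omp parallel for private(j, k) schedule(static)
  for (i = 0; i < SIZE; i++) {
    for (j = 0; j < SIZE; j++) {
      int sum = 0;  /* local to each (i, j) element, so no sharing issues */
      for (k = 0; k < SIZE; k++) {
        sum += a[i][k] * b[k][j];
      }
      c[i][j] = sum;
    }
  }

  double end = omp_get_wtime();
  printf("Parallel time: %f s\n", end - start);
  return 0;
}

Compiled with something like gcc -O2 -fopenmp, the parallel loop should start to win once the matrices are big enough that each thread's share of the work outweighs the cost of spinning up the threads.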
