
Parallelizing C code using OpenMP

I am trying to parallelize the following program using OpenMP:

#include <stdio.h>
#include <time.h>

// Program computes the total number of primes larger than 100000001 and smaller than 106000001.
int main() {

int number = 100000001;
int primes[20];
int i, j, is_prime, index = 0, nprimes = 0;
time_t start_time, end_time;

start_time = time(NULL);
for (i = 0; i < 3000000; i++) {
    // get the next number to check if it is a prime
    number += 2;
    is_prime = 1;
    for (j = 2; j < 10001; j++) {
        if ((number % j) == 0) {
            is_prime = 0;
            break;
        }
    }
    // Found a prime number. Count it and save the first 20 primes
    if (is_prime) nprimes++;
    if (is_prime && (index < 20)) {
        primes[index] = number;
        index++;
    }
}
for (i = 0; i < 20; i++)
    printf("%d is prime\n", primes[i]);
end_time = time(NULL);
printf("number of primes = %d, elapsed time is %ld seconds\n", nprimes, (long)(end_time - start_time));
return 0;
}

What I have done is this:

#include <stdio.h>
#include <time.h>
#include <omp.h>
#define CHUNKSIZE 750000
//#define CHUNKSIZE2 2500

// Program computes the total number of primes larger than 100000001 and smaller than 106000001.
int main() {

int number = 100000001;
int primes[20];
int i, j, is_prime, index = 0, nprimes = 0;
time_t start_time, end_time;

start_time = time(NULL);
int chunk = CHUNKSIZE;
//int chunk2 = CHUNKSIZE2;
#pragma omp parallel shared(number, index, nprimes, chunk) private(i, j, is_prime)
{
#pragma omp parallel for schedule (dynamic, chunk)
for (i = 0; i < 3000000; i++) {
    // get the next number to check if it is a prime
    number += 2;
    is_prime = 1;
    //#pragma omp parallel for schedule (dynamic, chunk2)
    for (j = 2; j < 10001; j++) {
        if ((number % j) == 0) {
            is_prime = 0;
            break;
        }
    }
    // Found a prime number. Count it and save the first 20 primes
    if (is_prime) nprimes++;
    if (is_prime && (index < 20)) {
        primes[index] = number;
        index++;
    }
}
} // end of the parallel region

for (i = 0; i < 20; i++)
    printf("%d is prime\n", primes[i]);
end_time = time(NULL);
printf("number of primes = %d, elapsed time is %ld seconds\n", nprimes, (long)(end_time - start_time));
return 0;
}

I have tried many things, but most of them gave me the same or a longer running time.

The number variable is shared and incremented by every iteration, which creates a dependency between iterations: each iteration needs the value left by the previous one for the number += 2 step to be consistent, so the iterations cannot safely be computed in parallel.

You can circumvent this by creating another, thread-specific variable (here n) whose value is computed from the loop index (i).
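As a small check of that idea, the sketch below compares the running update number += 2 with a value derived directly from the loop index. The helper names candidate_running and candidate_from_index are made up for this illustration. One detail worth noting: the original loop increments before testing, so the candidate tested in iteration i is number + 2*(i + 1); the expression number + i*2 used in the answer's code tests the same sequence shifted by one step.

```c
/* Candidate tested in iteration i of the serial loop, obtained by replaying
 * the (i + 1) sequential "number += 2" updates. Each step depends on the
 * previous one, which is what prevents parallel execution. */
static int candidate_running(int start, int i)
{
    int number = start;
    for (int k = 0; k <= i; k++)
        number += 2;
    return number;
}

/* The same candidate computed from the loop index alone. No shared state is
 * read or written, so every iteration can be evaluated independently. */
static int candidate_from_index(int start, int i)
{
    return start + 2 * (i + 1);
}
```

Both helpers return 100000003 for start = 100000001 and i = 0, the first number the serial program tests.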

A single #pragma omp parallel for is sufficient:

#include <stdio.h>
#include <time.h>
#include <omp.h>
#define CHUNKSIZE 750000
//#define CHUNKSIZE2 2500

// Program computes the total number of primes larger than 100000001 and smaller than 106000001.
int main() {

int number = 100000001;
int n;
int primes[20];
int i, j, is_prime, index = 0, nprimes = 0;
time_t start_time, end_time;

start_time = time(NULL);
int chunk = CHUNKSIZE;
//int chunk2 = CHUNKSIZE2;

#pragma omp parallel for private(n, is_prime, j)
for (i = 0; i < 300000; i++) {
    // get the next number to check if it is a prime
    //number += 2;
    n = number + i*2;
    is_prime = 1;
    //#pragma omp parallel for schedule (dynamic, chunk2)
    for (j = 2; j < 10001; j++) {
        if ((n % j) == 0) {
            is_prime = 0;
            break;
        }
    }
    // Found a prime number. Count it and save the first 20 primes
    if (is_prime) nprimes++;
    if (is_prime && (index < 20)) {
        primes[index] = n;
        index++;
    }
}

for (i = 0; i < 20; i++)
    printf("%d is prime\n", primes[i]);
end_time = time(NULL);
printf("number of primes = %d, elapsed time is %ld seconds\n", nprimes, (long)(end_time - start_time));
return 0;
}

Results with gcc, using a trimmed-down computation to avoid waiting too long:

$ gcc -fopenmp -o tt tt.c
$ time OMP_NUM_THREADS=1  ./tt
100000007 is prime
[...]
100000393 is prime
number of primes = 326390, elapsed time is 21 seconds

real    0m20.507s
user    0m20.492s
sys     0m0.001s
$ time OMP_NUM_THREADS=8  ./tt
101500027 is prime
[...]
105250049 is prime
number of primes = 325580, elapsed time is 3 seconds

real    0m3.041s
user    0m24.284s
sys     0m0.002s
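
The two runs above report different totals (326390 with one thread, 325580 with eight) and different lists of primes. That is consistent with nprimes, index and primes still being shared and updated without synchronization inside the parallel loop. One possible way to address this, sketched below under stated assumptions, is an OpenMP reduction(+:nprimes) clause for the counter, with the first 20 results collected in a separate serial pass so that their order does not depend on thread scheduling. The function names passes_trial_division, count_survivors and first_survivors are invented for this sketch, and the trial-division limit of 10001 is copied from the code above.

```c
/* Returns 1 if n has no divisor in [2, 10000], mirroring the inner loop of
 * the program above. Intended for n > 10000, as in that program. */
static int passes_trial_division(int n)
{
    for (int j = 2; j < 10001; j++) {
        if (n % j == 0)
            return 0;
    }
    return 1;
}

/* Counts the candidates start + 2*(i + 1), i in [0, count), that pass the
 * test. reduction(+:nprimes) gives every thread a private counter and adds
 * the private counters together when the loop ends, so no increment is lost. */
static int count_survivors(int start, int count)
{
    int nprimes = 0;
    #pragma omp parallel for reduction(+:nprimes)
    for (int i = 0; i < count; i++) {
        int n = start + 2 * (i + 1);  /* derived from i, no shared update */
        nprimes += passes_trial_division(n);
    }
    return nprimes;
}

/* Stores the first `wanted` passing candidates in increasing order, serially.
 * Returns how many values were written to out[]. */
static int first_survivors(int start, int count, int *out, int wanted)
{
    int found = 0;
    for (int i = 0; i < count && found < wanted; i++) {
        int n = start + 2 * (i + 1);
        if (passes_trial_division(n))
            out[found++] = n;
    }
    return found;
}
```

Compiled with gcc -fopenmp, count_survivors(100000001, 3000000) should return the same value for any OMP_NUM_THREADS setting. Without -fopenmp the pragma is ignored and the loop simply runs serially. The serial pass for the first 20 values is cheap because it stops as soon as 20 candidates have passed.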
