
A sample OpenMP program with speedup

Could someone provide an OpenMP program where the speedup is visible compared to running without it? I'm finding it extremely difficult to achieve speedup. Even this simple program runs slower with OpenMP. My processor is an Intel® Core™ i3-2370M CPU @ 2.40GHz × 4, running Linux (Ubuntu 14.10).

#include <cmath>
#include <stdio.h>
#include <time.h> 
int main() {
   clock_t t;
   t = clock();
   const int size = 4;
   long long int k;

    #pragma omp parallel for num_threads(4)
    for(int n=0; n<size; ++n) {
       for(int j=0;j<100000000;j++){ 
       }
       printf("\n");
    }

    t = clock() - t;
    printf ("It took me %ld clicks (%f seconds).\n",(long)t,((float)t)/CLOCKS_PER_SEC);

    return 0;
}

Calculating an integral is a classic example. Adjust the parts constant to increase the execution time and see the difference in runtime more clearly: more parts, more execution time. It takes 21.3 seconds with OpenMP enabled and 26.7 seconds without, on a single-core, dual-thread Intel Pentium 4:

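#include <math.h>
#include <stdio.h>
#include <omp.h>

/* Integration bounds and number of subintervals (midpoint rule).
   Double constants are used so that step is not rounded to float precision. */
#define from 0.0
#define to 2.0
#define parts 999999999
#define step ((to - from) / parts)
/* Midpoint of the first subinterval. */
#define x (from + (step / 2.0))

int main()
{
        double integralSum = 0;
        int i;
        /* Each thread accumulates a private partial sum; OpenMP combines them at the end. */
        #pragma omp parallel for reduction(+:integralSum)
        for (i = 0; i < parts; ++i)
        {
                integralSum = integralSum + (step * fabs(pow((x + (step * i)),2) + 4));
        }

        printf("%f\n", integralSum);

        return 0;
}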

It calculates the definite integral of x^2 + 4 from 0 to 2, whose exact value is 32/3 ≈ 10.667.

I had a related problem, where I wanted to find the maximum value of an array. I made the same mistake as you: I used clock() to measure the elapsed time. On Linux, clock() reports CPU time summed over all threads of the process, so a parallel run can look no faster, or even slower, than a serial one. To fix this, I used clock_gettime() instead, and now it works.

As for example code where the speedup is measurable (note that you might want to change the value of N):

#include <omp.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <time.h>

/* Returns the elapsed time between start and end, stored entirely in
   tv_nsec as total nanoseconds; tv_sec is set to 0.
   Assumes long is 64 bits wide, as on 64-bit Linux. */
struct timespec diff(struct timespec start, struct timespec end)
{
struct timespec temp;

temp.tv_sec = 0;
temp.tv_nsec = ((end.tv_sec - start.tv_sec)*1000000000) + end.tv_nsec - start.tv_nsec;

return temp;
}

int main()
{
unsigned int N;
struct timespec t_start, t_end;

srand(time(NULL));

FILE *f = fopen("out.txt", "w");
if(f == NULL)
{
    printf("Could not open output\n");
    return -1;
}

for(N = 1000000; N < 100000000; N += 1000000)
{
    fprintf(f, "%u\t", N);
    int* array = (int*)malloc(sizeof(int)*N);
    if(array == NULL)
    {
        printf("Not enough space\n");
        return -1;
    }
    for(unsigned int i = 0; i<N; i++) array[i] = rand();

    int max_val = 0;

    clock_gettime(CLOCK_MONOTONIC, &t_start);

    #pragma omp parallel for reduction(max:max_val)
    for(unsigned int i=0; i<N; i++)
    {
        if(array[i] > max_val) max_val = array[i];
    }

    clock_gettime(CLOCK_MONOTONIC, &t_end);

    fprintf(f, "%lf\t", (double)(diff(t_start, t_end).tv_nsec / 1000000000.0));

    max_val = 0;

    clock_gettime(CLOCK_MONOTONIC, &t_start);
    for(unsigned int i = 0; i<N; i++)
    {
        if(array[i] > max_val) max_val = array[i];
    }
    clock_gettime(CLOCK_MONOTONIC, &t_end);

    fprintf(f, "%lf\n", (double)(diff(t_start, t_end).tv_nsec / 1000000000.0));

    free(array);
}

fclose(f);

return 0;
}
