
Harmonic progression sum C++ MPI and OpenMP

I'm trying to make a parallel version of the "Harmonic Progression Sum" problem using MPI and OpenMP together, but the output is different for each process.

Could someone help me finish this problem?

Parallel program (MPI and OpenMP):

#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <sstream>
#include <time.h>
#include <omp.h>
#include <mpi.h>

#define d 10    //Numbers of Digits (Example: 5 => 0,xxxxx)
#define n 1000  //Value of N (Example: 5 => 1/1 + 1/2 + 1/3 + 1/4 + 1/5)

using namespace std;

double t_ini, t_fim, t_tot;

int getProcessId(){
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    return rank;
}

int numberProcess(){
    int numProc;
    MPI_Comm_size(MPI_COMM_WORLD, &numProc);
    return numProc;
}

void reduce(long unsigned int digits1 [])
{
    long unsigned int digits2[d + 11];
    int i = 0;
    for(i = 0; i < d + 11; i++) digits2[i] = 0;

    MPI_Allreduce(digits1, digits2,(d+11),MPI_INT,MPI_SUM,MPI_COMM_WORLD);

    for(i = 0; i < d + 11; i++) digits1[i] = digits2[i];

}

void slave(long unsigned int *digits)
{
    int idP = getProcessId(), numP = numberProcess();

    int i;
    long unsigned int digit;
    long unsigned int remainder;

    #pragma omp parallel for private(i, remainder, digit)
    for (i = idP+1; i <= n; i+=numP){
        remainder = 1;
        for (digit = 0; digit < d + 11 && remainder; ++digit) {
            long unsigned int div = remainder / i;
            long unsigned int mod = remainder % i;
            #pragma omp atomic
            digits[digit] += div;
            remainder = mod * 10;
        }
    }
}

void HPS(char* output) {
    long unsigned int digits[d + 11];

    for (int digit = 0; digit < d + 11; ++digit)
        digits[digit] = 0;

    reduce(digits);
    slave(digits);

    for (int i = d + 11 - 1; i > 0; --i) {
        digits[i - 1] += digits[i] / 10;
        digits[i] %= 10;
    }

    if (digits[d + 1] >= 5) ++digits[d];


    for (int i = d; i > 0; --i) {
        digits[i - 1] += digits[i] / 10;
        digits[i] %= 10;
    }
    stringstream stringstreamA;
    stringstreamA << digits[0] << ",";


    for (int i = 1; i <= d; ++i) stringstreamA << digits[i];

    string stringA = stringstreamA.str();
    stringA.copy(output, stringA.size());
}

int main(int argc, char **argv) {
    MPI_Init(&argc,&argv);

    t_ini = clock();

    //Parallel MPI with OpenMP method
    cout << "Parallel MPI with OpenMP Method: " << endl;
    char output[d + 10];
    HPS(output);

    t_fim = clock();
    t_tot = t_fim-t_ini;

    cout << "Parallel MPI with OpenMP Method: " << (t_tot / 1000) << endl;
    cout << output << endl;

    MPI_Finalize();

    system("PAUSE");
    return 0;
}
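
For completeness, with an Open MPI or MPICH style toolchain the program is compiled and launched roughly like this (hps.cpp is just a placeholder file name):

mpicxx -fopenmp hps.cpp -o hps
mpirun -np 4 ./hps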

Examples:

Input:

#define d 10
#define n 1000

Output:

7,4854708606

Input:

#define d 12
#define n 7

Output:

2,592857142857
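
For reference, the second example can be checked by hand: 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 = 363/140 = 2.592857142857..., so the program prints the integer part, a comma, and then d = 12 digits of the fractional part.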

You have a mistake here:

void HPS(char* output) {
    ...
    reduce(digits);
    slave(digits);

    ...
}

You should first compute and then perform the reduction, not the other way around. Change it to:

void HPS(char* output) {
    ...

    slave(digits);
    reduce(digits);
    ...
}
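
For clarity, here is a minimal sketch of the corrected HPS body, keeping the question's digits array, slave() and reduce() unchanged: every process first accumulates the partial digit sums for its own share of the terms, and only afterwards are the per-process arrays summed element-wise across all ranks.

void HPS(char* output) {
    long unsigned int digits[d + 11];
    for (int digit = 0; digit < d + 11; ++digit)
        digits[digit] = 0;

    slave(digits);   // each process computes its partial digit sums
    reduce(digits);  // then MPI_Allreduce combines them across processes

    // ... carry propagation, rounding and formatting as before ...
}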

Since you want to use MPI + OpenMP, you can also leave this outer loop:

for (i = idP+1; i <= n; i+=numP)

to be divided among the processes, and divide the inner loop among the threads:

 #pragma omp parallel for private(remainder)
 for (digit = 0; digit < d + 11 && remainder; ++digit) 

so that you end up with something like this:

    for (i = idP+1; i <= n; i+=numP){
        remainder = 1;
        #pragma omp parallel for private(remainder)
        for (digit = 0; digit < d + 11 && remainder; ++digit) {
            long unsigned int div = remainder / i;
            long unsigned int mod = remainder % i;
            #pragma omp atomic
            digits[digit] += div;
            remainder = mod * 10;
        }
    }

You can also, if you prefer (it is similar to what you did), divide the work of the outer loop among all the parallel tasks (threads/processes), like this:

#pragma omp parallel
{
    int idT = omp_get_thread_num();      // Get the thread id (only meaningful inside the parallel region)
    int numT = omp_get_num_threads();    // Get the number of threads
    int numParallelTask = numT * numP;   // Number of parallel tasks (threads * processes)
    int start = (idP+1) + (idT*numP);    // The first position this particular thread will work on

    for (i = start; i <= n; i+=numParallelTask)

    ...
}
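
Putting the pieces together, a complete (untested) sketch of slave() based on this scheme could be:

void slave(long unsigned int *digits)
{
    int idP = getProcessId(), numP = numberProcess();

    #pragma omp parallel
    {
        int idT = omp_get_thread_num();     // thread id inside this process
        int numT = omp_get_num_threads();   // threads per process
        int stride = numT * numP;           // total number of parallel tasks
        int start = (idP + 1) + idT * numP; // first term handled by this thread

        for (int i = start; i <= n; i += stride) {
            long unsigned int remainder = 1;
            for (int digit = 0; digit < d + 11 && remainder; ++digit) {
                long unsigned int div = remainder / i;
                long unsigned int mod = remainder % i;
                #pragma omp atomic
                digits[digit] += div;       // digits is shared, so update it atomically
                remainder = mod * 10;
            }
        }
    }
}

The offset idT * numP combined with the stride numT * numP ensures that every index from 1 to n is handled by exactly one (process, thread) pair.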

Note that I am not saying this will give you the best performance, but it is a start. After you get your algorithm working properly with MPI + OpenMP, you can move on to more sophisticated approaches.
