Finding the global maximum of a function by comparing each processor's local maximum using an MPI ring topology

I want to use an MPI ring topology: pass each processor's maximum around the ring, compare the local maxima, and then output the global maximum on all processors. I am using a 10-dimensional Monte Carlo integration function. My first idea was to build an array holding each processor's local maximum, then pass those values around, compare them, and output the highest one. However, I couldn't find an elegant way to build an array that takes only each processor's maximum and stores it at the index corresponding to that processor's rank, so that I can also keep track of which processor found the global maximum.

I haven't finished my code yet; for now I am interested in whether an array of the processors' local maxima can be created. The way I coded it is very tedious: if there are a lot of processors, I have to write out each one by hand, and I still couldn't produce the array I am looking for. Here is the code:

#include <iostream>
#include <fstream>
#include <iomanip>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <mpi.h>
using namespace std;

// define the multivariate function F(x1, x2, ..., xk)

double f(double x[], int n)
{
    double y;
    int j;
    y = 0.0;

    for (j = 0; j < n-1; j = j+1)
    {
        y = y + exp(-pow((1-x[j]),2)-100*(pow((x[j+1] - pow(x[j],2)),2)));
    }

    return y;
}

// define the function for Monte Carlo multidimensional integration

double int_mcnd(double(*fn)(double[],int), double a[], double b[], int n, int m)
{
    double r, x[n], v;   // note: x[n] is a variable-length array, a compiler extension in C++
    int i, j;
    r = 0.0;
    v = 1.0;
    // initial seed value (use system time)

    // step 1: calculate the common factor V
    for (j = 0; j < n; j = j+1)
    {
        v = v*(b[j]-a[j]);
    }

    // step 2: integration
    for (i = 1; i <= m; i = i+1)
    {
        // calculate random x[] points
        for (j = 0; j < n; j = j+1)
        {
            x[j] = a[j] + (rand()) / ((RAND_MAX/(b[j]-a[j])));
        }
        r = r + fn(x,n);
    }
    r = r*v/m;

    return r;
}

double f(double[], int);
double int_mcnd(double(*)(double[],int), double[], double[], int, int); 

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 // initialize MPI
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // get the current MPI process ID: 0, 1, ...
    MPI_Comm_size(MPI_COMM_WORLD, &size);   // get the total number of processes

    /* number of dimensions of the integral */
    const int n = 10;

    double b[n] = {5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0};
    double a[n] = {-5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0, -5.0};

    double result, mean;
    int m;

    const unsigned int N = 5;
    double max = -1;
    double max_store[4];

    cout.setf(ios::fixed | ios::showpoint);

    srand(time(NULL) * rank);  // each MPI process gets a unique seed

    m = 4;                // initial number of intervals

    // convert command-line input to N = number of points
    //N = atoi( argv[1] );

    for (unsigned int i = 0; i <= N; i++)
    {
        result = int_mcnd(f, a, b, n, m);
        mean = result/(pow(10,10));

        if (mean > max)
        {
            max = mean;
        }

        //cout << setw(10)  << m << setw(10) << max << setw(10) << mean << setw(10) << rank << setw(10) << size <<endl;
        m = m*4;
    }

    //cout << setw(30)  << m << setw(30) << result << setw(30) << mean <<endl;
    printf("Process %d of %d mean = %1.5e\n and local max = %1.5e\n", rank, size, mean, max );

    // each process fills only its own slot; the other entries stay uninitialized
    if (rank == 0)
        max_store[0] = max;
    else if (rank == 1)
        max_store[1] = max;
    else if (rank == 2)
        max_store[2] = max;
    else if (rank == 3)
        max_store[3] = max;

    for (int k = 0; k < 4; k++)
    {
        printf("%1.5e\n", max_store[k]);
    }

    //double max_store[4] = {4.43095e-02, 5.76586e-02, 3.15962e-02, 4.23079e-02};

    double send_junk = max_store[0];
    double rec_junk;
    MPI_Status status;

    // This next if-statement implements the ring topology.
    // The last process ID is size-1, so the ring topology is: 0->1, 1->2, ... size-1->0
    // Rank 0 starts the chain of events by passing to rank 1.
    if (rank == 0) {
        // only the process with rank ID = 0 will be in this block of code
        MPI_Send(&send_junk, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD); // send data to process 1
        MPI_Recv(&rec_junk, 1, MPI_DOUBLE, size-1, 0, MPI_COMM_WORLD, &status); // receive data from process size-1
    }
    else if (rank == size-1) {
        MPI_Recv(&rec_junk, 1, MPI_DOUBLE, rank-1, 0, MPI_COMM_WORLD, &status); // receive data from process rank-1 (its "left" neighbor)
        MPI_Send(&send_junk, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD); // send data to its "right" neighbor, rank 0
    }
    else {
        MPI_Recv(&rec_junk, 1, MPI_DOUBLE, rank-1, 0, MPI_COMM_WORLD, &status); // receive data from process rank-1 (its "left" neighbor)
        MPI_Send(&send_junk, 1, MPI_DOUBLE, rank+1, 0, MPI_COMM_WORLD); // send data to its "right" neighbor (rank+1)
    }
    printf("Process %d sent %1.5e\n and received %1.5e\n", rank, send_junk, rec_junk );

    MPI_Finalize(); // programs should always perform a "graceful" shutdown
    return 0;
}

Compile and run with:

mpiCC -o gd test_code.cpp
mpirun -np 4 ./gd

I would appreciate suggestions on the following:

  1. Is there a more elegant way to build the array of local maxima?
  2. How can I compare the local maxima and determine the global maximum while passing the values around a ring?

Also feel free to modify the code to give me a better example to work with. I would appreciate any suggestions. Thanks.

For this sort of thing, it is better to use either MPI_Reduce() or MPI_Allreduce() with MPI_MAX as the operator. The former computes the maximum over the values provided by all processes and gives the result to the "root" process only, while the latter does the same but gives the result to all processes.

// Only the process of rank 0 gets the global max
MPI_Reduce( &local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD );
// All processes get the global max
MPI_Allreduce( &local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD );
// All processes get the global max, stored in place of the local max
// after the call ends - this might be the most interesting one for you
MPI_Allreduce( MPI_IN_PLACE, &max, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD );

As you can see, you could just insert the third example into your code to solve your problem.

BTW, an unrelated remark, but this hurts my eyes:

if (rank == 0)
    max_store[0] = max;
else if (rank == 1)
    max_store[1] = max;
else if (rank == 2)
    max_store[2] = max;
else if (rank == 3)
    max_store[3] = max;

What about something like this instead:

if ( rank < 4 && rank >= 0 ) {
    max_store[rank] = max;
}

