
MPI_Reduce with a user-defined function and non-contiguous data

I'm trying to do some simple MPI projects (with MPICH), but I ran into a problem that I neither understand nor am able to solve (probably because I misunderstand the documentation). What I basically want to do is pass a struct to MPI_Reduce, perform some operations on it, and return the result to the root process.

To do so, I tried two different approaches. The first was to use MPI_Pack to pack the struct's elements into a buffer one after another and unpack them in my user function.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   /* uint8_t */

#define MAX_PACKING_BUF_SIZE 100

typedef struct {
    double x;
    int n;
} BE_Type;

int rank = -1;

void operation(void *invec, void *inoutvec, int *length, MPI_Datatype *type)
{
    uint8_t *buf  = (uint8_t*) invec;
    double *res     = (double*) inoutvec;
    int pos;
    BE_Type value;

    printf("[%d] len: %d\n", rank, *length);

    pos = 0;
    MPI_Unpack(buf, *length, &pos, &value.x, 1, MPI_DOUBLE, MPI_COMM_WORLD);
    MPI_Unpack(buf, *length, &pos, &value.n, 1, MPI_INT,    MPI_COMM_WORLD);

    printf("[%d] x: %lf, n: %d\n", rank, value.x, value.n);

    *res    += value.x;
}

int main (int argc, char *argv[])
{
    int rc, pos;
    int root    = 0;
    double res  = 0.0;
    MPI_Op my_op;
    BE_Type value;
    uint8_t buf[MAX_PACKING_BUF_SIZE];

    rc = MPI_Init(&argc,&argv);
    if (rc != MPI_SUCCESS) printf("ERROR\n");
    rc = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rc != MPI_SUCCESS) printf("ERROR\n");
    rc = MPI_Op_create( (MPI_User_function*) operation, 1, &my_op);
    if (rc != MPI_SUCCESS) printf("ERROR\n");

    value.x = 1.0;
    value.n = rank;

    pos = 0;
    MPI_Pack(&value.x, 1, MPI_DOUBLE, buf, MAX_PACKING_BUF_SIZE, &pos, MPI_COMM_WORLD);
    MPI_Pack(&value.n, 1, MPI_INT,    buf, MAX_PACKING_BUF_SIZE, &pos, MPI_COMM_WORLD);

    rc = MPI_Reduce(buf, &res, pos, MPI_PACKED, my_op, root, MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) printf("ERROR\n");

    if (rank == root) {
        printf("res: %lf\n", res);
    }
    MPI_Finalize();
    return 0;
}

This code, however, produces the following output (started with 4 processes):

[0] len: 12
[2] len: 12
[2] x: 1.000000, n: 3
[0] x: 1.000000, n: 1
[0] len: 12
[0] x: 2.000000, n: 2
res: 4.000000

So first of all I'm wondering why my function is only called three times instead of four? And second of all (and that's my main question): Why is that x value altered at one point?

Interestingly, the x value is altered in the same way with the second approach I tested, which was to define a new datatype:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>

typedef struct {
    double x;
    int n;
} BE_Type;

int rank = -1;

void operation(void *invec, void *inoutvec, int *length, MPI_Datatype *type)
{
    BE_Type *value  = (BE_Type*) invec;
    double *res     = (double*) inoutvec;

    printf("[%d] x: %lf, n: %d\n", rank, value->x, value->n);

    *res    += value->x;
}

int main (int argc, char *argv[])
{
    int rc, pos;
    int root    = 0;
    double res  = 0.0;
    MPI_Op my_op;
    BE_Type value;

    MPI_Datatype MPI_BE_Type;
    int blocklens[2];
    MPI_Aint indices[2];
    MPI_Datatype old_types[2];

    rc = MPI_Init(&argc,&argv);
    if (rc != MPI_SUCCESS) printf("ERROR\n");
    rc = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rc != MPI_SUCCESS) printf("ERROR\n");
    rc = MPI_Op_create( (MPI_User_function*) operation, 1, &my_op);
    if (rc != MPI_SUCCESS) printf("ERROR\n");

    blocklens[0] = 1;
    blocklens[1] = 1;
    old_types[0] = MPI_DOUBLE;
    old_types[1] = MPI_INT;
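    /* MPI_Address and MPI_Type_struct were removed in MPI-3.0;
       these are their replacements. */
    MPI_Get_address(&value.x, &indices[0]);
    MPI_Get_address(&value.n, &indices[1]);
    indices[1] = indices[1] - indices[0];
    indices[0] = 0;
    MPI_Type_create_struct(2, blocklens, indices, old_types, &MPI_BE_Type);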
    MPI_Type_commit(&MPI_BE_Type);

    value.x = 1.0;
    value.n = rank;

    rc = MPI_Reduce(&value, &res, 1, MPI_BE_Type, my_op, root, MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) printf("ERROR\n");

    if (rank == root) {
        printf("res: %lf\n", res);
    }
    MPI_Finalize();
    return 0;
}

This results in (4 processes started):

[2] x: 1.000000, n: 3
[0] x: 1.000000, n: 1
[0] x: 2.000000, n: 2
res: 4.000000

So I guess I'm just misunderstanding something or using it incorrectly. Any help is appreciated. Thanks!

So first of all I'm wondering why my function is only called three times instead of four?

Adding up N values takes N-1 additions. The same holds for any binary operation, no matter how you arrange the operands.
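The N-1 count can be checked with a few lines of plain C, no MPI involved. This is only a sketch: combine, reduce_all, calls_needed, and the call counter are hypothetical names made up for the illustration, and the left-to-right order is just one possible arrangement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical call counter, only here to make the N-1 claim observable. */
static int combine_calls = 0;

/* Stand-in for a user-defined reduction: adds two partial sums. */
static double combine(double a, double b)
{
    combine_calls++;
    return a + b;
}

/* Fold n values (n >= 1) into one and return the total. */
double reduce_all(const double *values, size_t n)
{
    assert(n >= 1);
    double acc = values[0];            /* the first value needs no call */
    for (size_t i = 1; i < n; i++)
        acc = combine(acc, values[i]);
    return acc;
}

/* Run reduce_all on n ones and report how often combine() was called. */
int calls_needed(size_t n)
{
    double ones[16];
    assert(n >= 1 && n <= 16);
    for (size_t i = 0; i < n; i++)
        ones[i] = 1.0;
    combine_calls = 0;
    (void) reduce_all(ones, n);
    return combine_calls;
}
```

Calling calls_needed(4) corresponds to the four processes in the question and gives 3, the number of times the user function was seen to run.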

And second of all (and that's my main question): Why is that x value altered at one point?

The operation is performed as a tree (typically a binomial tree). In your case it looks something like this:

Ranks   0 1 2 3

        1 1 1 1
        |/  |/
        +   +
        2   2
        |  /
        | /
        +
        4

Operations must always be associative, so this is a valid way to compute the result.
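To make the tree concrete, here is a small MPI-free sketch in plain C that replays the combine steps for simulated ranks. The names Step, combine, tree_reduce, and the accessor helpers are made up for illustration, and the pairing order is only one plausible binomial tree; a real MPI library is free to choose a different one.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    double x;
    int n;
} BE_Type;

#define MAX_RANKS 64

/* One logged combine step: which simulated rank ran it and what it got as input. */
typedef struct {
    int rank;
    BE_Type in;
} Step;

static Step steps[MAX_RANKS];
static int step_count = 0;

/* inout = in + inout, logging the input the way the question's printf does. */
static void combine(const BE_Type *in, BE_Type *inout, int rank)
{
    steps[step_count].rank = rank;
    steps[step_count].in = *in;
    step_count++;
    inout->x += in->x;
}

/* Simulate a binomial-tree reduction; rank r contributes {x = 1.0, n = r}. */
BE_Type tree_reduce(int nranks)
{
    BE_Type buf[MAX_RANKS];
    assert(nranks >= 1 && nranks <= MAX_RANKS);
    for (int r = 0; r < nranks; r++) {
        buf[r].x = 1.0;
        buf[r].n = r;
    }
    step_count = 0;
    /* In each round, rank r absorbs the partial result held by rank r + dist. */
    for (int dist = 1; dist < nranks; dist *= 2)
        for (int r = 0; r + dist < nranks; r += 2 * dist)
            combine(&buf[r + dist], &buf[r], r);
    return buf[0];
}

/* Accessors for the log of the most recent tree_reduce call. */
int steps_recorded(void) { return step_count; }
int step_rank(int i)     { return steps[i].rank; }
BE_Type step_input(int i) { return steps[i].in; }

/* Print the log in the same format as the question's output. */
void print_steps(void)
{
    for (int i = 0; i < step_count; i++)
        printf("[%d] x: %f, n: %d\n", steps[i].rank, steps[i].in.x, steps[i].in.n);
}
```

For four simulated ranks, the last logged step runs on rank 0 and receives x = 2.0, n = 2 as its input: that is the partial sum already computed by rank 2, not an original value. In a real run the order of the printed lines across ranks is not deterministic.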

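I wouldn't really recommend the first approach of packing and unpacking inside the custom reduction operation. In any case, your reduction function must treat both invec and inoutvec as BE_Type. MPI feeds partial results back through the same function, which is why an x of 2.0 shows up as input in the last call, and the receive buffer passed to MPI_Reduce must be a BE_Type as well. The function must also work for any length. So it could look like: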

void operation(void *invec, void *inoutvec, int *length, MPI_Datatype *type)
{
    BE_Type *value = (BE_Type*) invec;
    BE_Type *res   = (BE_Type*) inoutvec;

    for (int i = 0; i < *length; i++) {
        res[i].x += value[i].x;
        // should probably do something with n
    }
}
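As a rough check of the two requirements (same element type on both sides, any length), the function can be exercised without MPI. In this sketch Datatype_Stub is a hypothetical stand-in for MPI_Datatype so that the code compiles without <mpi.h>, reduce_element is a made-up helper, and summing n is purely an assumption, since the right treatment of n depends on what it means in your program.

```c
#include <assert.h>

typedef struct {
    double x;
    int n;
} BE_Type;

/* Hypothetical stand-in so the sketch compiles without <mpi.h>;
   in an MPI program this would be MPI_Datatype. */
typedef int Datatype_Stub;

/* Element-wise reduction: both buffers hold *length BE_Type elements. */
void operation(void *invec, void *inoutvec, int *length, Datatype_Stub *type)
{
    BE_Type *in = invec;
    BE_Type *inout = inoutvec;
    (void) type;                       /* unused in this sketch */

    for (int i = 0; i < *length; i++) {
        inout[i].x += in[i].x;
        inout[i].n += in[i].n;         /* assumption: n is summed as well */
    }
}

#define MAX_LEN 8

/* Fold the contributions of simulated ranks 0..nranks-1, where element i
   of rank r is {x = i + 1.0, n = r}, and return one element of the result. */
BE_Type reduce_element(int nranks, int length, int element)
{
    BE_Type acc[MAX_LEN], in[MAX_LEN];
    Datatype_Stub dummy = 0;

    assert(nranks >= 1 && length >= 1 && length <= MAX_LEN);
    assert(element >= 0 && element < length);

    for (int i = 0; i < length; i++) {  /* rank 0's own contribution */
        acc[i].x = i + 1.0;
        acc[i].n = 0;
    }
    for (int r = 1; r < nranks; r++) {
        for (int i = 0; i < length; i++) {
            in[i].x = i + 1.0;
            in[i].n = r;
        }
        operation(in, acc, &length, &dummy);
    }
    return acc[element];
}
```

With 4 simulated ranks and a length of 3, element 0 ends up with x = 4.0 and every element carries n = 0 + 1 + 2 + 3 = 6, so each array position is reduced independently of the others.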

