Writing a matrix into a single txt file with MPI

I have a huge matrix that I divided into sub-matrices, and I do some computation on them. After those computations I have to write the matrix into a single file for post-processing. Is it possible to write the results into a single text file, and how can I do it? For example, we have an nx × ny matrix that is divided in the y direction (each process holds an nx-wide piece of it), and we want to write the whole nx × ny matrix into a single text file.

It's not a good idea to write large amounts of data as text. It's really, really slow, it generates unnecessarily large files, and it's a pain to deal with. Large amounts of data should be written as binary, with only summary data for humans written as text. Make the stuff the computer is going to deal with easy for the computer, and make only the stuff you're actually going to sit down and read (e.g., text) easy for you.
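To put a rough number on the file-size point, here is a minimal sketch that compares the bytes needed for fixed-width text against raw binary floats. The function names `text_file_bytes` and `binary_file_bytes` are hypothetical helpers made up for this illustration, and the text size assumes every number takes a fixed number of characters.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to store an nrows x ncols matrix as fixed-width text,
 * where every number (including its trailing space or newline)
 * occupies exactly charspernum characters. Hypothetical helper. */
size_t text_file_bytes(size_t nrows, size_t ncols, size_t charspernum) {
    assert(charspernum > 0);
    return nrows * ncols * charspernum;
}

/* Bytes needed to store the same matrix as raw binary floats. */
size_t binary_file_bytes(size_t nrows, size_t ncols) {
    return nrows * ncols * sizeof(float);
}
```

For a 10 × 10 matrix at 9 characters per number, the text file takes 900 bytes, while the binary file takes 100 × sizeof(float) bytes (400 on platforms with 4-byte floats), and the text version keeps only three decimal places.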

Whether you're going to write as text or binary, you can use MPI-IO to coordinate your output so that all processes together generate one large file. We have a little tutorial on the topic (using MPI-IO, HDF5, and NetCDF) here. For MPI-IO, the trick is to define a type (here, a subarray) that describes the local layout of the data in terms of the global layout of the file, and then write to the file using that type as the "view". Each process sees only its own view, and the MPI-IO library coordinates the output so that, as long as the views are non-overlapping, everything comes out as one big file.
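To make the idea of a view concrete, the sketch below computes where a global matrix element lands in the shared file. It assumes a row-major (C order) layout in which every number occupies the same number of bytes (the program later in this answer calls that width charspernum). The helpers `file_offset` and `block_start_offset` are hypothetical and not part of MPI; with a subarray view, the MPI-IO library works out this mapping for you.

```c
#include <assert.h>

/* Byte offset in the shared file of global element (row, col),
 * assuming row-major order and charspernum bytes per number.
 * Hypothetical helper for illustration only. */
long long file_offset(int row, int col, int ncols, int charspernum) {
    assert(row >= 0 && col >= 0 && col < ncols && charspernum > 0);
    return ((long long)row * ncols + col) * charspernum;
}

/* First byte owned by a process whose block of rows starts at
 * global row startrow. */
long long block_start_offset(int startrow, int ncols, int charspernum) {
    return file_offset(startrow, 0, ncols, charspernum);
}
```

With 10 columns and 9 bytes per number, a process whose block starts at global row 6 begins writing at byte 540. Because the row blocks of different processes do not overlap, neither do their byte ranges.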

If we were writing this out in binary, we'd just point MPI_File_write at our data and be done with it; since we're using text, we have to convert our data into a string first. We define our array type the way we normally would, except that instead of being made of MPI_FLOATs, it is made of a new type that is charspernum characters per number.
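The fixed-width conversion only works if every number really does fill its slot exactly. As a minimal sketch, the hypothetical helper `format_row` below formats one row with the same "%8.3f" formats (9 characters per number) and reports a value that is too wide instead of silently breaking the layout. It uses snprintf rather than sprintf so that it never writes past the slot plus its terminator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format one row of ncols floats into dst as fixed-width text.
 * Each number takes exactly 9 characters: "%8.3f" plus a separator
 * (a space, or a newline after the last column).
 * dst must hold at least ncols*9 + 1 bytes; the +1 is for the
 * terminating '\0' that snprintf writes.
 * Returns 0 on success, -1 if a value does not fit its slot. */
int format_row(char *dst, const float *row, int ncols) {
    enum { CHARSPERNUM = 9 };
    assert(dst != NULL && row != NULL && ncols > 0);
    for (int j = 0; j < ncols; j++) {
        const char *fmt = (j == ncols - 1) ? "%8.3f\n" : "%8.3f ";
        /* size CHARSPERNUM + 1 leaves room for the '\0' terminator */
        int n = snprintf(dst + j * CHARSPERNUM, CHARSPERNUM + 1, fmt, row[j]);
        if (n != CHARSPERNUM)
            return -1;  /* value too wide: the fixed layout would break */
    }
    return 0;
}
```

A value such as 123456.789 needs ten characters under "%8.3f", so the helper returns -1 for it; in that case a wider format and a matching characters-per-number value would be needed.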

The full program follows:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <mpi.h>

float **alloc2d(int n, int m) {
    float *data = malloc(n*m*sizeof(float));
    float **array = malloc(n*sizeof(float *));
    for (int i=0; i<n; i++)
        array[i] = &(data[i*m]);
    return array;
}

int main(int argc, char **argv) {
    int ierr, rank, size;
    MPI_Offset offset;
    MPI_File   file;
    MPI_Status status;
    MPI_Datatype num_as_string;
    MPI_Datatype localarray;
    const int nrows=10;
    const int ncols=10;
    float **data;
    const char *const fmt="%8.3f ";      /* 8 chars + space   = charspernum */
    const char *const endfmt="%8.3f\n";  /* 8 chars + newline = charspernum */
    int startrow, endrow, locnrows;

    const int charspernum=9;

    ierr = MPI_Init(&argc, &argv);
    ierr|= MPI_Comm_size(MPI_COMM_WORLD, &size);
    ierr|= MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    locnrows = nrows/size;
    startrow = rank * locnrows;
    endrow = startrow + locnrows - 1;
    if (rank == size-1) {
        endrow = nrows - 1;
        locnrows = endrow - startrow + 1;
    }

    /* allocate local data */
    data = alloc2d(locnrows, ncols);

    /* fill local data */
    for (int i=0; i<locnrows; i++) 
        for (int j=0; j<ncols; j++)
            data[i][j] = rank;

    /* each number is represented by charspernum chars */
    MPI_Type_contiguous(charspernum, MPI_CHAR, &num_as_string); 
    MPI_Type_commit(&num_as_string); 

    /* convert our data into txt */
    /* +1: the final sprintf also writes a terminating '\0' */
    char *data_as_txt = malloc(locnrows*ncols*charspernum*sizeof(char) + 1);
    int count = 0;
    for (int i=0; i<locnrows; i++) {
        for (int j=0; j<ncols-1; j++) {
            sprintf(&data_as_txt[count*charspernum], fmt, data[i][j]);
            count++;
        }
        sprintf(&data_as_txt[count*charspernum], endfmt, data[i][ncols-1]);
        count++;
    }

    printf("%d: %s\n", rank, data_as_txt);

    /* create a type describing our piece of the array */
    int globalsizes[2] = {nrows, ncols};
    int localsizes [2] = {locnrows, ncols};
    int starts[2]      = {startrow, 0};
    int order          = MPI_ORDER_C;

    MPI_Type_create_subarray(2, globalsizes, localsizes, starts, order, num_as_string, &localarray);
    MPI_Type_commit(&localarray);

    /* open the file, and set the view */
    MPI_File_open(MPI_COMM_WORLD, "all-data.txt", 
                  MPI_MODE_CREATE|MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &file);

    MPI_File_set_view(file, 0,  MPI_CHAR, localarray, 
                           "native", MPI_INFO_NULL);

    MPI_File_write_all(file, data_as_txt, locnrows*ncols, num_as_string, &status);
    MPI_File_close(&file);

    MPI_Type_free(&localarray);
    MPI_Type_free(&num_as_string);

    free(data_as_txt);
    free(data[0]);
    free(data);

    MPI_Finalize();
    return 0;
}

Running gives:

$ mpicc -o matrixastxt matrixastxt.c  -std=c99
$ mpirun -np 4 ./matrixastxt
$ more all-data.txt 
   0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000
   0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000    0.000
   1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
   1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
   2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000
   2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000    2.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
   3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000    3.000
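The output shows rank 3 owning four rows while the other ranks own two each. That follows from the row split in the program: every rank gets nrows/size rows and the last rank also takes the remainder. The hypothetical helper `row_block` below is a small sketch that isolates that calculation.

```c
#include <assert.h>

/* Split nrows rows over size ranks the way the program does:
 * every rank gets nrows/size rows, and the last rank also takes
 * whatever remainder is left. Hypothetical helper for illustration. */
void row_block(int nrows, int size, int rank, int *startrow, int *locnrows) {
    assert(size > 0 && rank >= 0 && rank < size);
    int base = nrows / size;
    *startrow = rank * base;
    *locnrows = (rank == size - 1) ? nrows - *startrow : base;
}
```

For nrows = 10 and size = 4, ranks 0 to 2 get two rows each (starting at rows 0, 2, and 4), and rank 3 gets rows 6 to 9. When the remainder is large, spreading it over several ranks would balance the load better; this sketch simply mirrors the split used here.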
