MPI_Scatter 2d vector

I need to pass fragments of a vector to all processes in order to perform a matrix multiplication. I want to send each process a vector of vectors holding original_size / processes rows.

    std::vector<double> Algorytm::mnozenie(std::vector< std::vector<double> > matrix, std::vector<double> wektor) {
        std::vector<double> wynik(matrix.size(), 0);
        if (rozmiar_macierzy_ == (int)wektor.size()) {
            int size = matrix.size();
            int world_size;
            MPI_Comm_size(MPI_COMM_WORLD, &world_size);
            MPI_Bcast(&size, 1, MPI_INT, 0, MPI_COMM_WORLD);
            MPI_Bcast(&wektor.front(), wektor.size(), MPI_DOUBLE, 0, MPI_COMM_WORLD);

            std::vector< std::vector<double> > fragment_of_matrix(matrix.size() / world_size);
            std::vector<double> tmp(size);
            for (int i = 0; i < fragment_of_matrix.size(); i++) {
            int small_size = matrix.size() * matrix.at(0).size() / world_size;

            std::vector<double> wektor2(wektor.size() / world_size);
            // This works
            MPI_Scatter(&wektor.front(), wektor.size() / world_size, MPI_DOUBLE, &wektor2.front(), wektor.size() / world_size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
            // This doesn't :(
            MPI_Scatter(&matrix.front(), small_size, MPI_DOUBLE, &fragment_of_matrix.front(), small_size, MPI_DOUBLE, 0, MPI_COMM_WORLD);

            std::cout << "[ERROR]:" << std::endl;
        return wynik;

The call MPI_Scatter(&matrix.front(), ...) causes the following error:

orrMPI(38789,0x7fff76af2000) malloc: *** error for object 0x3ff0000000000000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
[MBP-2:38789] *** Process received signal ***
[MBP-2:38789] Signal: Abort trap: 6 (6)
[MBP-2:38789] Signal code:  (0)
[MBP-2:38789] [ 0] porrMPI(38788,0x7fff76af2000) malloc: *** error for object 0x7f8b618da200: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
fragment_macierzy rozmiar: 75x150 0   libsystem_platform.dylib            0x00007fff8c29feaa _sigtramp + 26
[MBP-2:38789] [ 1] 0   ???                                 0x0000000000000000 0x0 + 0
[MBP-2:38789] [ 2] 0   libsystem_c.dylib                   0x00007fff975696e7 abort + 129
[MBP-2:38789] [ 3] 0   libsystem_malloc.dylib              0x00007fff8e66c070 free + 425
[MBP-2:38789] [ 4] 0   porrMPI                             0x0000000101e77bed _ZNSt3__113__vector_baseIdNS_9allocatorIdEEED2Ev + 509
[MBP-2:38789] [ 5] 0   porrMPI                             0x0000000101e779e5 _ZNSt3__16vectorIdNS_9allocatorIdEEED2Ev + 21
[MBP-2:38789] [ 6] 0   porrMPI                             0x0000000101e727b5 _ZNSt3__16vectorIdNS_9allocatorIdEEED1Ev + 21
[MBP-2:38789] [ 7] 0   porrMPI                             0x0000000101e77d33 _ZNSt3__113__vector_baseINS_6vectorIdNS_9allocatorIdEEEENS2_IS4_EEED2Ev + 275
[MBP-2:38789] [ 8] 0   porrMPI                             0x0000000101e77c15 _ZNSt3__16vectorINS0_IdNS_9allocatorIdEEEENS1_IS3_EEED2Ev + 21
[MBP-2:38789] [ 9] 0   porrMPI                             0x0000000101e727f5 _ZNSt3__16vectorINS0_IdNS_9allocatorIdEEEENS1_IS3_EEED1Ev + 21
[MBP-2:38788] *** Process received signal ***
[MBP-2:38788] Signal: Abort trap: 6 (6)
[MBP-2:38788] Signal code:  (0)
[MBP-2:38788] [ 0] 0   libsystem_platform.dylib            0x00007fff8c29feaa _sigtramp + 26
[MBP-2:38788] [ 1] 0   ???                                 0x0000ffff00001fa0 0x0 + 281470681751456
[MBP-2:38788] [ 2] 0   libsystem_c.dylib                   0x00007fff975696e7 abort + 129
[MBP-2:38788] [ 3] 0   libsystem_malloc.dylib              0x00007fff8e66c070 free + 425
[MBP-2:38788] [ 4] 0   porrMPI                             0x000000010b4cdbed _ZNSt3__113__vector_baseIdNS_9allocatorIdEEED2Ev + 509
[MBP-2:38788] [ 5] 0   porrMPI                             0x000000010b4cd9e5 _ZNSt3__16vectorIdNS_9allocatorIdEEED2Ev + 21
[MBP-2:38788] [ 6] 0   porrMPI                             0x000000010b4c87b5 _ZNSt3__16vectorIdNS_9allocatorIdEEED1Ev + 21
[MBP-2:38788] [ 7] 0   porrMPI                             0x000000010b4cdd33 _ZNSt3__113__vector_baseINS_6vectorIdNS_9allocatorIdEEEENS2_IS4_EEED2Ev + 275
[MBP-2:38788] [ 8] 0   porrMPI                             0x000000010b4cdc15 _ZNSt3__16vectorINS0_IdNS_9allocatorIdEEEENS1_IS3_EEED2Ev + 21
[MBP-2:38788] [ 9] 0   porrMPI                             0x000000010b4c87f5 _ZNSt3__16vectorINS0_IdNS_9allocatorIdEEEENS1_IS3_EEED1Ev + 21
[MBP-2:38788] [10] 0   porrMPI                             0x000000010b4d48d1 _ZN17AlgorytmCzebyszew19obliczMacierzRownanEv + 8289
[MBP-2:38788] [11] 0   porrMPI                             0x000000010b4d7f78 main + 504
[MBP-2:38788] [12] 0   libdyld.dylib                       0x00007fff976475ad start + 1
[MBP-2:38788] [13] 0   ???                                 0x0000000000000001 0x0 + 1
[MBP-2:38788] *** End of error message ***
[MBP-2:38789] [10] 0   porrMPI                             0x0000000101e8235c _Z5slavev + 700
[MBP-2:38789] [11] 0   porrMPI                             0x0000000101e81e54 main + 212
[MBP-2:38789] [12] 0   libdyld.dylib                       0x00007fff976475ad start + 1
[MBP-2:38789] [13] 0   ???                                 0x0000000000000001 0x0 + 1
[MBP-2:38789] *** End of error message ***
mpiexec noticed that process rank 0 with PID 38788 on node MBP-2 exited on signal 6 (Abort trap: 6).

How do I correctly pass chunks of this 2D vector to all processes?

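A 2D vector (std::vector< std::vector<double> >) is not contiguous in memory. Each inner vector owns its own separate heap allocation, and &matrix.front() points at the first inner vector object (its internal pointers and size), not at any of the doubles. So although small_size is the correct number of elements, MPI_Scatter reads that many doubles starting from the wrong address, and on the receiving side it writes raw doubles over the internals of the vectors in fragment_of_matrix. That is why both processes abort inside the vector destructor. In the log above, the address being freed, 0x3ff0000000000000, is the bit pattern of the double 1.0, which is consistent with vector internals having been overwritten by matrix data.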
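To make the pointer mismatch concrete, here is a minimal sketch. The Matrix alias and the two helper names are illustrative and do not come from the question's code. It shows that `&matrix.front()` has type `std::vector<double>*` and that it points into a different allocation from the one holding the first row's doubles.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Illustrative alias; the question spells the type out in full.
using Matrix = std::vector<std::vector<double>>;

// &matrix.front() yields a pointer to the first row *object*, not to a double.
static_assert(
    std::is_same<decltype(&std::declval<Matrix&>().front()),
                 std::vector<double>*>::value,
    "&matrix.front() is a std::vector<double>*");

// Address that MPI_Scatter(&matrix.front(), ...) would start reading from.
// Requires a non-empty matrix.
inline const void* first_row_object_address(const Matrix& m) {
    return &m.front();
}

// Address where the doubles of the first row actually live.
// Requires a non-empty matrix whose first row is non-empty.
inline const void* first_element_address(const Matrix& m) {
    return m.front().data();
}
```

The two addresses always differ, because the row objects live in the outer vector's buffer while each row's doubles live in that row's own buffer.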

The straightforward way to fix this:

  1. Create a one-dimensional vector, temp_matrix, from the 2D vector matrix, then perform the scatter on temp_matrix. Make sure each receiver also uses a 1D buffer. In effect, this converts the 2D vector into a flat array, scatters it, and then converts the received chunk back into a vector of vectors.
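The step above can be sketched as follows. The helper names flatten and unflatten are hypothetical, and the sketch assumes a rectangular matrix stored row by row. The MPI calls appear only in the trailing comment, since they need `<mpi.h>` and an initialized MPI environment; they also assume the number of rows is divisible by world_size and that cols is known on every rank (for example via MPI_Bcast).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Copy a rectangular matrix into one contiguous row-major buffer.
std::vector<double> flatten(const Matrix& matrix) {
    if (matrix.empty()) return {};
    const std::size_t cols = matrix.front().size();
    std::vector<double> flat;
    flat.reserve(matrix.size() * cols);
    for (const auto& row : matrix) {
        if (row.size() != cols) {
            throw std::invalid_argument("matrix is not rectangular");
        }
        flat.insert(flat.end(), row.begin(), row.end());
    }
    return flat;
}

// Rebuild rows of length `cols` from a contiguous row-major buffer.
Matrix unflatten(const std::vector<double>& flat, std::size_t cols) {
    if (cols == 0 || flat.size() % cols != 0) {
        throw std::invalid_argument("buffer size is not a multiple of cols");
    }
    Matrix matrix;
    matrix.reserve(flat.size() / cols);
    for (std::size_t i = 0; i < flat.size(); i += cols) {
        matrix.emplace_back(flat.data() + i, flat.data() + i + cols);
    }
    return matrix;
}

// Possible use with MPI (assumed names: rank, rows, cols, world_size):
//
//   std::vector<double> flat;                    // filled on rank 0 only
//   if (rank == 0) flat = flatten(matrix);
//   const int chunk = rows / world_size * cols;  // doubles per rank
//   std::vector<double> local(chunk);
//   MPI_Scatter(flat.data(), chunk, MPI_DOUBLE,
//               local.data(), chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);
//   Matrix fragment = unflatten(local, cols);
```

MPI_Scatter sends the same count to every rank, so if the row count is not divisible by the number of processes, MPI_Scatterv with per-rank counts and displacements is the usual alternative.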

However, this is a wasteful solution if the matrix is huge, because the flat copy temporarily doubles the memory used on the root. I expect this is not an issue in your case: if one PE (here the master, PE 0) can hold the complete matrix, then the matrix cannot be that large.
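If the extra copy is a concern, one option is to keep the matrix in a single row-major buffer from the start, so there is nothing to flatten before scattering. This is a suggestion rather than something the question's code does, and the FlatMatrix type and multiply function below are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper type: a dense row-major matrix in one allocation.
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // rows * cols elements, row-major

    FlatMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(r * c, 0.0) {}

    // Element access: (i, j) maps to index i * cols + j.
    double& operator()(std::size_t i, std::size_t j) {
        return data[i * cols + j];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return data[i * cols + j];
    }

    // Pointer to the start of row i; rows are adjacent in memory, so a block
    // of whole rows can be passed to MPI as a single contiguous buffer.
    double* row(std::size_t i) { return data.data() + i * cols; }
};

// Multiply a block of rows by a vector: y = A * x.
std::vector<double> multiply(const FlatMatrix& a, const std::vector<double>& x) {
    if (x.size() != a.cols) {
        throw std::invalid_argument("vector length must equal matrix cols");
    }
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t j = 0; j < a.cols; ++j) {
            y[i] += a(i, j) * x[j];
        }
    }
    return y;
}
```

With this layout, the root can scatter from data.data() directly, and each rank can receive its rows into the data buffer of a smaller FlatMatrix and call multiply on it.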

