MPI_Scatterv and MPI_Gatherv for multiple 3D arrays

Question

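I am new to programming in general and to MPI in particular. I am trying to scatter multiple arrays from the root process to the other processes, perform some operations on those arrays, and then gather the data. However, all of the data gets scattered to every process and the output adjacency matrices are incorrect, so I assume I am using scatterv and/or gatherv wrongly. I am not sure whether I should scatter the matrices element by element or whether there is a way to scatter whole matrices. I would appreciate it if you could take a look at my code. Thanks!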

int rank, size;
MPI_Status status;
MPI_Datatype strip;
bool passflag[Nmats];


MPI::Init();
rank = MPI::COMM_WORLD.Get_rank();
size = MPI::COMM_WORLD.Get_size();
int sendcounts[size], recvcounts, displs[size], rcounts[size];

if(rank == root){

    fin.open(infname);
    fout.open(outfname);
    /* INPUT ADJ-MATS */
    for(i = 0; i < Nmats; i++){
        fin >> dummy;
        for (j = 0; j < N; j++){
            for (k = 0; k < N; k++) {
                fin >> a[i][j][k];
            }
        }
    }
}
/* Nmats = Number of matrices; N = nodes; Nmats isn't divisible by the number of processors */

Nmin= Nmats/size;
Nextra = Nmats%size;
k=0;
for(i=0; i<size; i++){
    if( i < Nextra) sendcounts[i] = Nmin + 1;
    else sendcounts[i] = Nmin;
    displs[i] = k;
    k = k + sendcounts[i];
}
recvcounts = sendcounts[rank];
MPI_Type_vector(Nmin, N, N, MPI_FLOAT, &strip);
MPI_Type_commit(&strip);

MPI_Scatterv(a, sendcounts, displs, strip, a, N*N, strip, 0, MPI_COMM_WORLD);

/* Perform operations on adj-mats */

for(i=0; i<size; i++){
    if(i<Nextra) rcounts[i] = Nmin + 1;
    else rcounts[i] = Nextra;
    displs[i] = k;
    k = k + rcounts[i];

}


MPI_Gatherv(&passflag, 1, MPI::BOOL, &passflag, rcounts , displs, MPI::BOOL, 0, MPI_COMM_WORLD);

MPI::Finalize();
//OUTPUT ADJ_MATS
for(i = 0; i < Nmats; i++) if (passflag[i]) {
    for(j=0;j<N; j++){
        for(k=0; k<N; k++){
            fout << a[i][j][k] << " ";
        }
        fout << endl;
    }
    fout << endl;
}
fout << endl;

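Hi, I was able to get the code working with static allocation, but when I try to allocate the array dynamically the code more or less "breaks". I am not sure whether I need to allocate the memory outside of MPI or whether I should do it after initializing MPI. All suggestions are welcome!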

//int a[Nmats][N][N];

/* Prior to adding this part of the code it ran fine, now it's no longer working */ 
int *** a = new int**[Nmats];
for(i = 0; i < Nmats; ++i){
   a[i] = new int*[N];
   for(j = 0; j < N; ++j){
       a[i][j] = new int[N];
       for(k = 0; k < N; k++){
           a[i][j][k] = 0;
       }
           }
               } 

int rank, size;
MPI_Status status;
MPI_Datatype plane;
bool passflag[Nmats];


MPI::Init();
rank = MPI::COMM_WORLD.Get_rank();
size = MPI::COMM_WORLD.Get_size();
MPI_Type_contiguous(N*N, MPI_INT, &plane);
MPI_Type_commit(&plane);

int counts[size], recvcounts, displs[size+1];

if(rank == root){

fin.open(infname);   
fout.open(outfname);
    /* INPUT ADJ-MATS */
for(i = 0; i < Nmats; i++){         
  fin >> dummy;
  for (j = 0; j < N; j++){ 
          for (k = 0; k < N; k++) { 
                  fin >> a[i][j][k];                                              
                }
        }
  }

  } 


Nmin= Nmats/size;
Nextra = Nmats%size;
k=0;
for(i=0; i<size; i++){
   if( i < Nextra) counts[i] = Nmin + 1;
   else counts[i] = Nmin;
   displs[i] = k;
   k = k + counts[i];
}   
recvcounts = counts[rank];
displs[size] = Nmats;                        

MPI_Scatterv(&a[displs[rank]][0][0], counts, displs, plane, &a[displs[rank]][0][0],        recvcounts, plane, 0, MPI_COMM_WORLD);

/* Perform operations on matrices */

MPI_Gatherv(&passflag[displs[rank]], counts, MPI::BOOL, &passflag[displs[rank]], &counts[rank], displs, MPI::BOOL, 0, MPI_COMM_WORLD);

MPI_Type_free(&plane);  
MPI::Finalize();

Answer 1

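It appears that what you actually have in a is Nmat planes of N x N elements each. The way you index a while filling in its elements in the nested loops shows that these matrices are laid out contiguously in memory. You should therefore treat a as an array of Nmat elements, each of which is a compound of N*N values. You only have to register a contiguous type that spans the memory of a single matrix: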

MPI_Type_contiguous(N*N, MPI_FLOAT, &plane);
MPI_Type_commit(&plane);
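
This type description is only valid when each matrix really occupies N*N consecutive elements. That holds for a statically declared a[Nmats][N][N], but not for an int*** built from a separate new call per row, as in the dynamically allocated version of the question, where the rows can end up anywhere on the heap. A minimal sketch of one way to keep a dynamically sized array contiguous, assuming a hypothetical helper class Array3D (it is not part of the original post or of MPI):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration): Nmats matrices of
// N x N values stored in ONE contiguous buffer, so that matrix i starts at
// data() + i*N*N and occupies exactly N*N consecutive elements.
template <typename T>
class Array3D {
public:
    Array3D(std::size_t nmats, std::size_t n)
        : nmats_(nmats), n_(n), buf_(nmats * n * n, T()) {}

    // Element access that replaces a[i][j][k].
    T& operator()(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < nmats_ && j < n_ && k < n_);
        return buf_[(i * n_ + j) * n_ + k];
    }

    // Address of the first element of matrix i.
    T* plane(std::size_t i) {
        assert(i < nmats_);
        return buf_.data() + i * n_ * n_;
    }

    // Address of the whole buffer (first element of matrix 0).
    T* data() { return buf_.data(); }

    // Total number of stored elements (Nmats * N * N).
    std::size_t size() const { return buf_.size(); }

private:
    std::size_t nmats_, n_;
    std::vector<T> buf_;
};
```

With such a layout, data() (or plane(i)) would be the buffer address handed to the MPI calls, and a(i, j, k) replaces a[i][j][k] in the input and output loops. Allocating before or after MPI initialization should not matter for this; what matters is that the buffer is one block.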

Scattering the data without using an additional array at the root is done with the in-place mode of the scatter operation:

// Perform an in-place scatter
if (rank == 0)
   MPI_Scatterv(a, sendcounts, displs, plane,
                MPI_IN_PLACE, 0, plane, 0, MPI_COMM_WORLD);
   //                         ^^^^^^^^ ignored because of MPI_IN_PLACE
else
   MPI_Scatterv(a, sendcounts, displs, plane,
   //           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ignored by non-root ranks
                a, sendcounts[rank], plane, 0, MPI_COMM_WORLD);
   //              ^^^^^^^^^^^^^^^^ !!!

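Note that each rank must specify the correct number of planes it should receive by supplying the corresponding element of sendcounts[] (in your code it was fixed to N*N).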

The in-place mode should also be used in the gather operation:

// MPI_CXX_BOOL (MPI-3.0) is the datatype for C++ bool; with the older
// C++ bindings used in the question the equivalent is MPI::BOOL.
if (rank == 0)
   MPI_Gatherv(MPI_IN_PLACE, 0, MPI_CXX_BOOL,
   //                        ^^^^^^^^^^^^^^^^ ignored because of MPI_IN_PLACE
               passflag, rcounts, displs, MPI_CXX_BOOL, 0, MPI_COMM_WORLD);
else
   MPI_Gatherv(passflag, rcounts[rank], MPI_CXX_BOOL,
   //                    ^^^^^^^^^^^^^ !!!
               passflag, rcounts, displs, MPI_CXX_BOOL, 0, MPI_COMM_WORLD);
   //          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ignored by non-root ranks

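Note that rcounts and sendcounts have essentially the same values, so you do not have to compute them twice. Simply call the array counts and use it in both the MPI_Scatterv and the MPI_Gatherv call. The same applies to the values in displs: do not compute them twice, since they are identical. You also do not appear to reset k to zero before the second computation (although that may simply not be shown in the code posted here).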
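
The single computation of counts and displacements could look roughly like the following sketch. The names Partition and make_partition are hypothetical helpers introduced only for illustration; the arithmetic is the Nmin/Nextra split from the question.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type: how many matrices each rank handles (counts)
// and the index of the first matrix of each rank (displs).
struct Partition {
    std::vector<int> counts;
    std::vector<int> displs;
};

// Splits nmats matrices over `size` ranks as evenly as possible: the first
// (nmats % size) ranks receive one extra matrix each.
Partition make_partition(int nmats, int size) {
    assert(nmats >= 0 && size > 0);
    Partition p;
    p.counts.resize(size);
    p.displs.resize(size);
    const int nmin = nmats / size;
    const int nextra = nmats % size;
    int offset = 0;  // starts at zero and is computed only once
    for (int i = 0; i < size; ++i) {
        p.counts[i] = nmin + (i < nextra ? 1 : 0);
        p.displs[i] = offset;
        offset += p.counts[i];
    }
    return p;
}
```

Because the scatter counts are expressed in units of one whole matrix (the plane type) and the gather returns one flag per matrix, the same counts.data() and displs.data() pointers can be passed to both calls.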

Solution 1
0 2014-07-08 22:50:34