C MPI program calling a Fortran routine crashes
I have successfully programmed matrix-matrix multiplication on a single node, and my goal now is to link that program so it executes in parallel across cluster nodes.

The main work is modifying the ScaLAPACK (Netlib) source code so that part of the matrix-matrix multiplication (in this case, the call to dgemm_
) is computed by my own routine ( mydgemm
), i.e. changing the original code (ScaLAPACK). Here the original code is a C program, but every routine it calls is a Fortran routine (for example, dgemm_
is written in Fortran), while my routine ( mydgemm
) is a C program.

After the modification I can run successfully on a single node with any matrix size, but when I run on 4 nodes with a matrix size larger than 200, I get an error related to the data communicated between the nodes (MPI).
This is the error:
*BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
PID 69754 RUNNING AT localhost.localdomain
EXIT CODE: 11
CLEANING UP REMAINING PROCESSES
YOU CAN IGNORE THE BELOW CLEANUP MESSAGES*
In the main function I just use MPI to create a random matrix on each node (shown below), and then call the routine new_pdgemm (...)
. (I modified the code inside new_pdgemm
.) Inside mydgemm.c
I use OpenMP for parallelism, and this code runs on the cores.

Could you give me some guidance or ideas to solve my problem?

Do you think the problem is that Fortran is column-major while C is row-major? Or do I need to rewrite mydgemm.c
as mydgemm.f
(which would be really hard; I might not manage it)?
My code:
int main(int argc, char **argv) {
    int i, j, k;
    /************ MPI ***************************/
    int myrank_mpi, nprocs_mpi;
    MPI_Init( &argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank_mpi);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs_mpi);
    /************ BLACS ***************************/
    int ictxt, nprow, npcol, myrow, mycol, nb;
    int info, itemp;
    int _ZERO = 0, _ONE = 1;
    int M = 20000;
    int K = 20000;
    int N = 20000;
    nprow = 2; npcol = 2;
    nb = 1200;
    Cblacs_pinfo( &myrank_mpi, &nprocs_mpi );
    Cblacs_get( -1, 0, &ictxt );
    Cblacs_gridinit( &ictxt, "Row", nprow, npcol );
    Cblacs_gridinfo( ictxt, &nprow, &npcol, &myrow, &mycol );
    //printf("myrank = %d\n",myrank_mpi);
    int rA = numroc_( &M, &nb, &myrow, &_ZERO, &nprow );
    int cA = numroc_( &K, &nb, &mycol, &_ZERO, &npcol );
    int rB = numroc_( &K, &nb, &myrow, &_ZERO, &nprow );
    int cB = numroc_( &N, &nb, &mycol, &_ZERO, &npcol );
    int rC = numroc_( &M, &nb, &myrow, &_ZERO, &nprow );
    int cC = numroc_( &N, &nb, &mycol, &_ZERO, &npcol );
    double *A = (double*) malloc(rA*cA*sizeof(double));
    double *B = (double*) malloc(rB*cB*sizeof(double));
    double *C = (double*) malloc(rC*cC*sizeof(double));
    int descA[9], descB[9], descC[9];
    descinit_(descA, &M, &K, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rA, &info);
    descinit_(descB, &K, &N, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rB, &info);
    descinit_(descC, &M, &N, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rC, &info);
    double alpha = 1.0; double beta = 1.0;
    double start, end, flops;
    srand(time(NULL)*myrow+mycol);
    #pragma simd
    for (j=0; j<rA*cA; j++)
    {
        A[j]=((double)rand()-(double)(RAND_MAX)*0.5)/(double)(RAND_MAX);
        // printf("A in myrank: %d\n",myrank_mpi);
    }
    // printf("A: %d\n",myrank_mpi);
    #pragma simd
    for (j=0; j<rB*cB; j++)
    {
        B[j]=((double)rand()-(double)(RAND_MAX)*0.5)/(double)(RAND_MAX);
    }
    #pragma simd
    for (j=0; j<rC*cC; j++)
    {
        C[j]=((double)rand()-(double)(RAND_MAX)*0.5)/(double)(RAND_MAX);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    start=MPI_Wtime();
    new_pdgemm ("N", "N", &M , &N , &K , &alpha, A , &_ONE, &_ONE , descA , B , &_ONE, &_ONE , descB , &beta , C , &_ONE, &_ONE , descC );
    MPI_Barrier(MPI_COMM_WORLD);
    end=MPI_Wtime();
    if (myrow==0 && mycol==0)
    {
        flops = 2 * (double) M * (double) N * (double) K / (end-start) / 1e9;
        /* printf("This is value: %d\t%d\t%d\t%d\t%d\t%d\t\n",rA,cA,rB,cB,rC,cC);
           printf("%f\t%f\t%f\n", A[4], B[6], C[3]);*/
        printf("%f Gflops\n", flops);
    }
    Cblacs_gridexit( 0 );
    MPI_Finalize();
    free(A);
    free(B);
    free(C);
    return 0;
}
OK, this isn't really an answer, but it is too long for a comment, and I still hope an answer gets you the formatting.

So I fixed the bug with blacs_gridexit mentioned in the comments, i.e. I made the ictxt argument what the routine description requires. I then replaced your routine with the standard pdgemm. Once those changes were made, I reduced the matrix sizes to 2,000 * 2,000 to fit on my laptop. With that, the code ran successfully, at least in the sense that it reported no errors and gave a plausible GFlop rate. So this suggests to me that the problem lies with your installation or your own code.

Thus I would reinstall the libraries you are using, make sure they are consistent with the compilers being used, run the tests the libraries provide, and include the header files you omitted from your code (don't do that, they are really important!). If all of that works, I would suggest the problem is due to a bug in your code. Is there a reason you can't show it?
Anyway, below is the code I ran successfully. If I were doing this in my own code, I would certainly fix all those compiler warnings by making sure the appropriate prototypes are in scope when the functions are called.
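For example, hand-written declarations along these lines would silence the implicit-declaration warnings. This is a sketch: the exact signatures here are assumptions based on common BLACS/ScaLAPACK calling conventions, not copied from an official header, and the integer types must match how your library was compiled.

```c
/* Assumed prototypes for the C BLACS / ScaLAPACK entry points used
 * below -- verify against your installation's documentation. */
extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int icontxt, int what, int *val);
extern void Cblacs_gridinit(int *icontxt, const char *layout, int nprow, int npcol);
extern void Cblacs_gridinfo(int icontxt, int *nprow, int *npcol, int *myrow, int *mycol);
extern void Cblacs_gridexit(int icontxt);
extern int  numroc_(const int *n, const int *nb, const int *iproc,
                    const int *isrcproc, const int *nprocs);
extern void descinit_(int *desc, const int *m, const int *n,
                      const int *mb, const int *nb, const int *irsrc,
                      const int *icsrc, const int *ictxt, const int *lld, int *info);
extern void pdgemm_(const char *transa, const char *transb,
                    const int *m, const int *n, const int *k,
                    const double *alpha, const double *a, const int *ia,
                    const int *ja, const int *desca,
                    const double *b, const int *ib, const int *jb, const int *descb,
                    const double *beta, double *c, const int *ic,
                    const int *jc, const int *descc);
```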
ian-admin@agon ~/work/stack/mpi $ cat stack.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "mpi.h"
int main(void) {
    int i, j, k;
    /************ MPI ***************************/
    int myrank_mpi, nprocs_mpi;
    MPI_Init( NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank_mpi);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs_mpi);
    /************ BLACS ***************************/
    int ictxt, nprow, npcol, myrow, mycol, nb;
    int info, itemp;
    int _ZERO = 0, _ONE = 1;
    int M = 2000;
    int K = 2000;
    int N = 2000;
    nprow = 2; npcol = 2;
    nb = 1200;
    Cblacs_pinfo( &myrank_mpi, &nprocs_mpi );
    Cblacs_get( -1, 0, &ictxt );
    Cblacs_gridinit( &ictxt, "Row", nprow, npcol );
    Cblacs_gridinfo( ictxt, &nprow, &npcol, &myrow, &mycol );
    //printf("myrank = %d\n",myrank_mpi);
    int rA = numroc_( &M, &nb, &myrow, &_ZERO, &nprow );
    int cA = numroc_( &K, &nb, &mycol, &_ZERO, &npcol );
    int rB = numroc_( &K, &nb, &myrow, &_ZERO, &nprow );
    int cB = numroc_( &N, &nb, &mycol, &_ZERO, &npcol );
    int rC = numroc_( &M, &nb, &myrow, &_ZERO, &nprow );
    int cC = numroc_( &N, &nb, &mycol, &_ZERO, &npcol );
    double *A = (double*) malloc(rA*cA*sizeof(double));
    double *B = (double*) malloc(rB*cB*sizeof(double));
    double *C = (double*) malloc(rC*cC*sizeof(double));
    int descA[9], descB[9], descC[9];
    descinit_(descA, &M, &K, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rA, &info);
    descinit_(descB, &K, &N, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rB, &info);
    descinit_(descC, &M, &N, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rC, &info);
    double alpha = 1.0; double beta = 1.0;
    double start, end, flops;
    srand(time(NULL)*myrow+mycol);
    #pragma simd
    for (j=0; j<rA*cA; j++)
    {
        A[j]=((double)rand()-(double)(RAND_MAX)*0.5)/(double)(RAND_MAX);
        // printf("A in myrank: %d\n",myrank_mpi);
    }
    // printf("A: %d\n",myrank_mpi);
    #pragma simd
    for (j=0; j<rB*cB; j++)
    {
        B[j]=((double)rand()-(double)(RAND_MAX)*0.5)/(double)(RAND_MAX);
    }
    #pragma simd
    for (j=0; j<rC*cC; j++)
    {
        C[j]=((double)rand()-(double)(RAND_MAX)*0.5)/(double)(RAND_MAX);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    start=MPI_Wtime();
    pdgemm_ ("N", "N", &M , &N , &K , &alpha, A , &_ONE, &_ONE , descA , B , &_ONE, &_ONE , descB , &beta , C , &_ONE, &_ONE , descC );
    MPI_Barrier(MPI_COMM_WORLD);
    end=MPI_Wtime();
    if (myrow==0 && mycol==0)
    {
        flops = 2 * (double) M * (double) N * (double) K / (end-start) / 1e9;
        /* printf("This is value: %d\t%d\t%d\t%d\t%d\t%d\t\n",rA,cA,rB,cB,rC,cC);
           printf("%f\t%f\t%f\n", A[4], B[6], C[3]);*/
        printf("%f Gflops\n", flops);
    }
    Cblacs_gridexit( ictxt );
    MPI_Finalize();
    free(A);
    free(B);
    free(C);
    return 0;
}
ian-admin@agon ~/work/stack/mpi $ mpicc -g stack.c /home/ian-admin/Downloads/scalapack-2.0.2/libscalapack.a -llapack -lblas -lgfortran
stack.c: In function ‘main’:
stack.c:24:4: warning: implicit declaration of function ‘Cblacs_pinfo’ [-Wimplicit-function-declaration]
Cblacs_pinfo( &myrank_mpi, &nprocs_mpi ) ;
^~~~~~~~~~~~
stack.c:25:4: warning: implicit declaration of function ‘Cblacs_get’ [-Wimplicit-function-declaration]
Cblacs_get( -1, 0, &ictxt );
^~~~~~~~~~
stack.c:26:4: warning: implicit declaration of function ‘Cblacs_gridinit’ [-Wimplicit-function-declaration]
Cblacs_gridinit( &ictxt, "Row", nprow, npcol );
^~~~~~~~~~~~~~~
stack.c:27:4: warning: implicit declaration of function ‘Cblacs_gridinfo’ [-Wimplicit-function-declaration]
Cblacs_gridinfo( ictxt, &nprow, &npcol, &myrow, &mycol );
^~~~~~~~~~~~~~~
stack.c:31:13: warning: implicit declaration of function ‘numroc_’ [-Wimplicit-function-declaration]
int rA = numroc_( &M, &nb, &myrow, &_ZERO, &nprow );
^~~~~~~
stack.c:44:6: warning: implicit declaration of function ‘descinit_’ [-Wimplicit-function-declaration]
descinit_(descA, &M, &K, &nb, &nb, &_ZERO, &_ZERO, &ictxt, &rA, &info);
^~~~~~~~~
stack.c:72:5: warning: implicit declaration of function ‘pdgemm_’ [-Wimplicit-function-declaration]
pdgemm_ ("N", "N", &M , &N , &K , &alpha, A , &_ONE, &_ONE , descA , B , &_ONE, &_ONE , descB , &beta , C , &_ONE, &_ONE , descC );
^~~~~~~
stack.c:83:4: warning: implicit declaration of function ‘Cblacs_gridexit’ [-Wimplicit-function-declaration]
Cblacs_gridexit( ictxt );
^~~~~~~~~~~~~~~
/usr/bin/ld: warning: libgfortran.so.3, needed by //usr/lib/liblapack.so, may conflict with libgfortran.so.5
ian-admin@agon ~/work/stack/mpi $ mpirun -np 4 --oversubscribe ./a.out
9.424291 Gflops