
MPI program freezing during MPI_Recv

I'm a beginner in MPI parallel programming. I've written this little piece of code to draw the Mandelbrot fractal. The idea is that the first slave calculates the first half, stores it in a buffer, and sends it to the master node, which is waiting to receive it. The second slave does the same for the second half. Finally, the master node should have the results in two different variables and write them to one file.

......
    if((itertab=malloc((sizeof(int)*sizebuffre))) == NULL) { 
        printf("ERREUR , errno : %d (%s) .\n",errno,strerror(errno)); 
        return EXIT_FAILURE; 
    } 
    int rank, size,start,end;

    MPI_Init (&argc, &argv); /* starts MPI */ 
    MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */ 
    MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
    MPI_Status st;

    /*allocation du tableau de pixel*/ 
    if (rank==1) { 
        xpixel = 0; 
        end = (nbpixelx/MACHINE_NUM); 
        Calcule(xpixel,end); 
        printf("rank 1 start : %d, end : %d\n",xpixel,end); 
        MPI_Send(&itertab,sizebuffre,MPI_INT,0,5,MPI_COMM_WORLD); 
        free(itertab); 
        printf("work done : i am rank 1 \n"); 
    } 
    if (rank==2) {
        xpixel = (nbpixelx/MACHINE_NUM); 
        end = nbpixelx; 
        Calcule(xpixel,end); 
        printf("rank 2 start : %d, end : %d\n",xpixel,end); 
        MPI_Send(&itertab,sizebuffre,MPI_INT,0,6,MPI_COMM_WORLD); 
        printf("work done : i am rank 2 \n"); 
        free(itertab); 
    }

    if (rank==0) { 
        if((itertabA=malloc((sizeof(int)*sizebuffre))) == NULL) { 
            printf("ERREUR d'allocation de itertabA, errno : %d (%s) .\n",errno,strerror(errno)); 
            return EXIT_FAILURE; 
        } 
        if((itertabB=malloc((sizeof(int)*sizebuffre))) == NULL) { 
            printf("ERREUR d'allocation de itertabB, errno : %d (%s) .\n",errno,strerror(errno)); 
            return EXIT_FAILURE; 
        }
        printf("test before reciving result from first slave\n");
        MPI_Recv(itertabA,sizebuffre,MPI_INT,1,5,MPI_COMM_WORLD,&st); 
        printf("result recived  rank 1 \n"); 
        MPI_Recv(itertabB,sizebuffre,MPI_INT,2,6,MPI_COMM_WORLD,&st); 
        printf("result recived rank 2 \n");



    }

    MPI_Finalize(); 
    return EXIT_SUCCESS; 
}

The problem is that my code freezes on the line where the master receives the result from the first slave, and I don't know why.

I tried to debug it by adding some printf calls to see where it freezes. This is the output:

test before reciving result from first slave
test in calcule function
trairment xpixel 0
trairment xpixel 1
trairment xpixel 2
...snip...
trairment xpixel 399
test after the end off calculating loop
rank 1 start : 0, end : 400
^C

Your MPI code is not working properly because you are passing the wrong argument to MPI_Send. Your variable itertab is already a pointer to your data buffer, so you must not take its address with &: &itertab is the address of the pointer variable itself, not of the data.

Instead of:

MPI_Send(&itertab,sizebuffre,MPI_INT,0,5,MPI_COMM_WORLD);

do:

MPI_Send(itertab,sizebuffre,MPI_INT,0,5,MPI_COMM_WORLD);
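The same change applies to the rank 2 call that uses tag 6. To see why the compiler does not complain, here is a small sketch that needs no MPI at all. copy_ints, make_buffer and pointer_and_data_differ are hypothetical helpers written for this illustration; copy_ints only imitates the way MPI_Send takes its buffer as an untyped pointer and reads count elements from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a send routine (not an MPI call): like
 * MPI_Send, it receives the buffer as an untyped pointer and reads
 * `count` ints starting at that address. */
void copy_ints(const void *buf, size_t count, int *dest)
{
    assert(buf != NULL && dest != NULL);
    memcpy(dest, buf, count * sizeof(int));
}

/* Allocate `count` ints and fill them with 0, 10, 20, ...
 * Returns NULL if the allocation fails; the caller frees the result. */
int *make_buffer(size_t count)
{
    int *itertab = malloc(count * sizeof *itertab);
    if (itertab == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        itertab[i] = (int)(i * 10);
    return itertab;
}

/* Returns 1 when the pointer variable lives at a different address than
 * the data it points to, which is why `&itertab` is the wrong argument. */
int pointer_and_data_differ(int **itertab_var)
{
    return (void *)itertab_var != (void *)*itertab_var;
}
```

Calling copy_ints(itertab, n, out) copies the n values from the heap buffer. Passing &itertab would also compile, because any object pointer converts to const void *, but the routine would then read n ints starting at the pointer variable itself, which is undefined behavior.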

Another issue is that you are accessing unallocated memory, both in your Calcule function and in the output loop. In the Calcule function, you are writing into itertab[xpixel*nbpixely+ypixel]=iter. This will fail for process 2, since it allocates only its local part of the itertab buffer. You'll need to subtract an offset from xpixel.

In the output loop, you are reading itertabB with the global index. Here, you should also subtract an offset from xpixel, like so:

fprintf(file,"%f %f %d\n", x, y,itertabB[(xpixel-(nbpixelx/MACHINE_NUM))*nbpixely+ypixel]);
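The same offset arithmetic can be put in one helper and used in both Calcule and the output loop. This is only a sketch: local_index and first_column are hypothetical names, and first_column assumes the columns are split evenly among the workers, as in your rank 1 / rank 2 split.

```c
#include <assert.h>

/* Index into a buffer that holds only the image columns starting at
 * x_start, with nbpixely entries per column. */
int local_index(int xpixel, int ypixel, int x_start, int nbpixely)
{
    assert(xpixel >= x_start);
    assert(ypixel >= 0 && ypixel < nbpixely);
    return (xpixel - x_start) * nbpixely + ypixel;
}

/* First column handled by a worker (ranks start at 1), assuming an
 * even split of nbpixelx columns over machine_num workers. */
int first_column(int worker, int nbpixelx, int machine_num)
{
    assert(worker >= 1 && machine_num >= 1);
    return (worker - 1) * (nbpixelx / machine_num);
}
```

For example, assuming an 800 x 600 image and two workers, rank 2 starts at column first_column(2, 800, 2), which is 400, and its first pixel maps to local_index(400, 0, 400, 600), which is 0 instead of the global index 240000.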
