MPI_Scatterv: segmentation fault 11 on process 0 only
I'm trying to scatter values among processes belonging to a hypercube group (quicksort project). Depending on the number of processes, I either create a new communicator excluding the excess processes, or I duplicate MPI_COMM_WORLD if it fits a hypercube exactly (a power of 2).
In both cases, processes other than 0 receive their data, but:
- In the first scenario, process 0 throws a segmentation fault 11.
- In the second scenario, nothing faults, but the values received by process 0 are gibberish.
NOTE: If I try a regular MPI_Scatter everything works well.
#include <mpi.h>
#include <cmath>
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;

//Input
vector<int> LoadFromFile();
int d;              //dimension of hypercube
int p;              //active processes
int idle;           //idle processes
vector<int> values; //values loaded
int arraySize;      //number of total values to distribute

int main(int argc, char* argv[])
{
    int mpiWorldRank;
    int mpiWorldSize;
    int mpiRank;
    int mpiSize;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &mpiWorldRank);
    MPI_Comm_size(MPI_COMM_WORLD, &mpiWorldSize);
    MPI_Comm MPI_COMM_HYPERCUBE;

    d = log2(mpiWorldSize);
    p = pow(2, d);              //Number of processes belonging to the hypercube
    idle = mpiWorldSize - p;    //number of processes in excess
    int toExclude[idle];        //array of idle processes to exclude from communicator
    int sendCounts[p];          //array of values sizes to be sent to processes

    int i = 0;
    while (i < idle)
    {
        toExclude[i] = mpiWorldSize - 1 - i;
        ++i;
    }

    //CREATING HYPERCUBE GROUP: Group of size of power of 2 -----------------
    MPI_Group world_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    // Remove excessive processors if any from communicator
    if (idle > 0)
    {
        MPI_Group newGroup;
        MPI_Group_excl(world_group, 1, toExclude, &newGroup);
        MPI_Comm_create(MPI_COMM_WORLD, newGroup, &MPI_COMM_HYPERCUBE);
        //Abort any processor not part of the hypercube.
        if (mpiWorldRank > p)
        {
            cout << "aborting: " << mpiWorldRank << endl;
            MPI_Finalize();
            return 0;
        }
    }
    else
    {
        MPI_Comm_dup(MPI_COMM_WORLD, &MPI_COMM_HYPERCUBE);
    }
    MPI_Comm_rank(MPI_COMM_HYPERCUBE, &mpiRank);
    MPI_Comm_size(MPI_COMM_HYPERCUBE, &mpiSize);
    //END OF: CREATING HYPERCUBE GROUP --------------------------

    if (mpiRank == 0)
    {
        //STEP1: Read input
        values = LoadFromFile();
        arraySize = values.size();
    }

    //Transforming input vector into an array
    int valuesArray[values.size()];
    if (mpiRank == 0)
    {
        copy(values.begin(), values.end(), valuesArray);
    }

    //Broadcast input size to all processes
    MPI_Bcast(&arraySize, 1, MPI_INT, 0, MPI_COMM_HYPERCUBE);

    //MPI_Scatterv: determining size of arrays to be received and displacement
    int nmin = arraySize / p;
    int remainingData = arraySize % p;
    int displs[p];
    int recvCount;
    int k = 0;
    for (i = 0; i < p; i++)
    {
        sendCounts[i] = i < remainingData
            ? nmin + 1
            : nmin;
        displs[i] = k;
        k += sendCounts[i];
    }
    recvCount = sendCounts[mpiRank];
    int recvValues[recvCount];

    //Following MPI_Scatter works well:
    // MPI_Scatter(&valuesArray, 13, MPI_INT, recvValues, 13, MPI_INT, 0, MPI_COMM_HYPERCUBE);
    MPI_Scatterv(&valuesArray, sendCounts, displs, MPI_INT, recvValues, recvCount, MPI_INT, 0, MPI_COMM_HYPERCUBE);

    int j = 0;
    while (j < recvCount)
    {
        cout << "rank " << mpiRank << " received: " << recvValues[j] << endl;
        ++j;
    }

    MPI_Finalize();
    return 0;
}
First of all, you are supplying wrong arguments to MPI_Group_excl:

MPI_Group_excl(world_group, 1, toExclude, &newGroup);
//                          ^

The second argument specifies the number of entries in the exclusion list and should therefore be equal to idle. Since you are excluding a single rank only, the resulting group has mpiWorldSize-1 ranks, and hence MPI_Scatterv expects that both sendCounts[] and displs[] have that many elements. Of those, only p elements are properly initialised and the rest are random, therefore MPI_Scatterv crashes in the root.
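For reference, here is a minimal sketch of the corrected call (my illustration, not part of the original answer), passing idle as the count of ranks to exclude; it only makes sense inside the if (idle > 0) branch:

// Exclude all 'idle' trailing ranks, not just one of them
MPI_Group_excl(world_group, idle, toExclude, &newGroup);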
Another error is the code that aborts the idle processes: it should read if (mpiWorldRank >= p).
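In other words, with the question's variable names (a sketch of the fixed guard, assuming the surplus ranks are the highest-numbered ones, as in the original code):

// Ranks p, p+1, ..., mpiWorldSize-1 are idle and must leave before the collective calls
if (mpiWorldRank >= p)
{
    cout << "aborting: " << mpiWorldRank << endl;
    MPI_Finalize();
    return 0;
}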
I would recommend that the entire exclusion code be replaced by a single call to MPI_Comm_split instead:

MPI_Comm comm_hypercube;
int colour = mpiWorldRank >= p ? MPI_UNDEFINED : 0;
MPI_Comm_split(MPI_COMM_WORLD, colour, mpiWorldRank, &comm_hypercube);
if (comm_hypercube == MPI_COMM_NULL)
{
    MPI_Finalize();
    return 0;
}
When no process supplies MPI_UNDEFINED as its colour, the call is equivalent to MPI_Comm_dup.
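For example, the surviving processes can then query their rank and size on the new communicator just as the original code did with MPI_COMM_HYPERCUBE (a small usage sketch):

MPI_Comm_rank(comm_hypercube, &mpiRank);
MPI_Comm_size(comm_hypercube, &mpiSize);   // mpiSize equals p for the hypercube group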
Note that you should avoid using names starting with MPI_ in your code, as those could clash with symbols from the MPI implementation.
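For instance (an illustrative rename, not from the original post):

MPI_Comm MPI_COMM_HYPERCUBE;   // avoid: the MPI_ prefix is reserved for the MPI implementation
MPI_Comm comm_hypercube;       // fine: ordinary user-level name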
Additional note: std::vector<T> uses contiguous storage, therefore you could do without copying the elements into a regular array and simply provide the address of the first element in the call to MPI_Scatter(v):

MPI_Scatterv(&values[0], ...);
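Putting the pieces together, a hedged sketch of the scatter without the intermediate C array (assuming recvValues is made a std::vector<int> of size recvCount; values.data() needs C++11, otherwise &values[0] works the same way):

std::vector<int> recvValues(recvCount);
MPI_Scatterv(values.data(), sendCounts, displs, MPI_INT,
             recvValues.data(), recvCount, MPI_INT,
             0, comm_hypercube);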