What causes my array address to be corrupted (changed) when passed to a function?
I am performing Compressed Sparse Row matrix-vector multiplication (CSR SpMV). This involves dividing the array A into multiple chunks and passing each chunk by pointer to a function. However, only the first chunk (the one starting at A[0], the beginning of the array) is processed correctly. From the second iteration on, with A[0 + chunkIndex], the function jumps and reads addresses beyond the address range of the whole array when it reads the sub-array, although the indices are correct.
For reference:
The SpMV kernel is:
void serial_matvec(size_t TS, double *A, int *JA, int *IA, double *X, double *Y)
{
    double sum;
    for (int i = 0; i < TS; ++i)
    {
        sum = 0.0;
        for (int j = IA[i]; j < IA[i + 1]; ++j)
        {
            sum += A[j] * X[JA[j]]; // the error is here: the function reads a
                                    // different address of A and JA, so the
                                    // access is out of bounds
        }
        Y[i] = sum;
    }
}
and it is called this way:
int chunkIndex = 0;
for (size_t k = 0; k < rows / TS; ++k)
{
    chunkIndex = IA[k * TS];
    serial_matvec(TS, &A[chunkIndex], &JA[chunkIndex], &IA[k * TS], &X[0], &Y[k * TS]);
}
Assume I process an 8x8 matrix with 2 rows per chunk, so the loop over k runs rows/TS = 4 times. The chunkIndex and the arrays passed to the function are as follows:
chunkIndex: 0 --> loop k = 0, &A[0], &JA[0]
chunkIndex: 16 --> loop k = 1, &A[16], &JA[16] // [ERROR here, function reads different address]
chunkIndex: 32 --> loop k = 2, &A[32], &JA[32] // [ERROR here, function reads different address]
chunkIndex: 48 --> loop k = 3, &A[48], &JA[48] // [ERROR here, function reads different address]
When I run the code, only the first chunk executes correctly. For the other 3 chunks the memory appears corrupted, and the array pointers jump beyond the bounds of the array.
I've checked the indices of every parameter manually and they are all correct, yet when I print the addresses they are not the same. (I have been debugging this for 3 days now.)
I ran valgrind, and it reported Invalid read of size 8 and Use of uninitialised value of size 8 at the line sum += A[j] * X[JA[j]];
I compiled it with -g -fsanitize=address and got heap-buffer-overflow.
I tried to access these chunks manually outside the function, and they are correct. So what can cause the heap memory to be corrupted like this?
The code is here; this is the smallest example I could produce.
The problem was that I was using global indices (the indices from main) when indexing the portion of the array (chunk) passed to the function, hence the out-of-bounds access.
The solution is to start indexing the sub-arrays from 0 at each function call, but that led to another problem. At each function call I process TS rows, and each row has a different number of non-zeros.
As an example, see the picture for chunk 1 (sorry for my bad handwriting, it is easier this way). As you can see, we need 3 indices: i for the TS rows processed per chunk, j because each row has a different number of non-zeros, and l to index the sub-array passed to the function, which was the original problem.
With those three indices, the serial_matvec function becomes:
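void serial_matvec(size_t TS, const double *A, const int *JA, const int *IA,
                   const double *X, double *Y) {
  int l = 0; // running index into the chunk-local A and JA
  for (size_t i = 0; i < TS; ++i) { // rows of this chunk
    double sum = 0.0; // accumulate locally so Y need not be zeroed beforehand
    for (int j = 0; j < (IA[i + 1] - IA[i]); ++j) { // non-zeros of row i
      sum += A[l] * X[JA[l]];
      l++;
    }
    Y[i] = sum;
  }
}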
The complete code with a test is here. If anyone has a more elegant solution, you are more than welcome.