
What causes my array address to be corrupted (change) when passed to function?

I am performing Compressed Sparse Row matrix-vector multiplication (CSR SpMV). This involves dividing the array A into multiple chunks and passing each chunk by reference to a function. However, only the first part of the array (A[0], the first chunk, starting at the beginning of the array) is processed correctly. Starting from the second loop iteration (A[0 + chunkIndex]), when the function reads the sub-array it jumps and reads a different address beyond the total array address range, although the indices are correct.

For reference:

[image]


The SpMV kernel is:

void serial_matvec(size_t TS, double *A, int *JA, int *IA, double *X, double *Y)
{
    double sum;
    for (int i = 0; i < TS; ++i)
    {
        sum = 0.0;
        for (int j = IA[i]; j < IA[i + 1]; ++j)
        {
            // the error is here: the function reads a different address
            // of A and JA, so the access is out of bounds
            sum += A[j] * X[JA[j]];
        }
        Y[i] = sum;
    }
}

It is called this way:

int chunkIndex = 0;
for(size_t k = 0; k < rows/TS; ++k)
{
    chunkIndex = IA[k * TS];
    serial_matvec(TS, &A[chunkIndex], &JA[chunkIndex], &IA[k*TS], &X[0], &Y[k*TS]);
}

Assume I process an 8x8 matrix with 2 rows per chunk, so the loop over k runs rows/TS = 4 times. The chunkIndex and the arrays passed to the function are as follows:

chunkIndex: 0 --> loop k = 0, &A[0], &JA[0]

chunkIndex: --> loop k = 1, &A[16], &JA[16] // [ERROR here, function reads different address]

chunkIndex: --> loop k = 2, &A[32], &JA[32] // [ERROR here, function reads different address]

chunkIndex: --> loop k = 3, &A[48], &JA[48] // [ERROR here, function reads different address]

When I run the code, only the first chunk executes correctly. For the other 3 chunks the memory is corrupted and the array pointers jump past the end of the array.

I have manually checked all the indices of all the parameters and they are all correct, but when I print the addresses they are not the same. (I have been debugging this for 3 days now.)

I used valgrind and it reported:

Invalid read of size 8 and Use of uninitialised value of size 8 at the line sum += A[j] * X[JA[j]];

I compiled it with -g -fsanitize=address and I got

heap-buffer-overflow

I tried to access these chunks manually outside the function and they are correct, so what can cause the heap memory to be corrupted like this?

The code is here. This is the minimum I can do.

The problem was that I was using global indices (the indices used inside main) when indexing the portion of the array (the chunk) passed to the function, hence the out-of-bounds problem.
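To make the double offset concrete, here is a minimal sketch that computes which elements of the full A array the original kernel ends up touching for each chunk. It assumes the 8x8 example from the question is dense (8 non-zeros per row, 64 stored values) with TS = 2. The names fill_dense_ia, first_global_index and last_global_index are hypothetical helpers for illustration only and are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed example sizes: dense 8x8 matrix, 2 rows per chunk. */
enum { ROWS = 8, NNZ_PER_ROW = 8, TS = 2, NNZ = ROWS * NNZ_PER_ROW };

/* Fill IA (ROWS + 1 entries) for the assumed dense matrix. */
void fill_dense_ia(int *IA)
{
    for (int i = 0; i <= ROWS; ++i)
        IA[i] = i * NNZ_PER_ROW;
}

/* First index into the full A array read by the original kernel for chunk k.
   The caller already passes &A[IA[k*TS]], and the kernel then starts at
   j = IA[k*TS] again, so the chunk offset is applied twice. */
size_t first_global_index(const int *IA, size_t k)
{
    assert((k + 1) * TS <= ROWS);
    size_t chunkIndex = (size_t)IA[k * TS]; /* offset applied by the caller */
    size_t first_j = (size_t)IA[k * TS];    /* offset applied by the kernel */
    return chunkIndex + first_j;
}

/* Last index into the full A array read by the original kernel for chunk k. */
size_t last_global_index(const int *IA, size_t k)
{
    assert((k + 1) * TS <= ROWS);
    size_t chunkIndex = (size_t)IA[k * TS];
    size_t last_j = (size_t)IA[(k + 1) * TS] - 1;
    return chunkIndex + last_j;
}
```

Under these assumptions chunk k = 0 reads elements 0 to 15 as intended. Chunk k = 1 stays inside the 64-element array but reads elements 32 to 47 instead of 16 to 31, and chunks k = 2 and k = 3 reach indices 79 and 111, which is past the end and matches the heap-buffer-overflow report.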

The solution is to start indexing the sub-arrays from 0 at each function call, but then I had another problem: at each function call I process TS rows, and each row has a different number of non-zeros.

As an example, see the picture for chunk 1 (sorry for my bad handwriting, it is easier this way). As you can see, we need 3 indices: i for the TS rows processed per chunk, j because each row has a different number of non-zeros, and l to index the sub-array passed in, which was the original problem.

[image]

The serial_matvec function then becomes:

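void serial_matvec(size_t TS, const double *A, const int *JA, const int *IA,
                   const double *X, double *Y) {
  int l = 0;  // running index into the chunk-local A and JA, restarts at 0 per call
  for (size_t i = 0; i < TS; ++i) {
    // IA[i + 1] - IA[i] is the number of non-zeros in row i of this chunk
    for (int j = 0; j < (IA[i + 1] - IA[i]); ++j) {
      Y[i] += A[l] * X[JA[l]];  // accumulates, so the caller must zero Y first
      l++;
    }
  }
}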

The complete code with test is here. If anyone has a more elegant solution, you are more than welcome.
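One possibly tidier variant, offered only as a sketch: keep the two-loop structure of the original kernel and subtract IA[0] (the first row pointer of the chunk) so that j becomes chunk-local. This removes the extra counter l and does not require Y to be zeroed beforehand. The names serial_matvec_rebased and chunked_spmv are hypothetical, and the driver assumes rows is a multiple of TS, as in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Chunked CSR SpMV kernel. A and JA point at the first non-zero of the
   chunk. IA points at the chunk's first row pointer and still holds global
   offsets, so IA[0] is subtracted to obtain chunk-local indices. */
void serial_matvec_rebased(size_t TS, const double *A, const int *JA,
                           const int *IA, const double *X, double *Y)
{
    const int base = IA[0];
    for (size_t i = 0; i < TS; ++i) {
        double sum = 0.0;
        for (int j = IA[i] - base; j < IA[i + 1] - base; ++j)
            sum += A[j] * X[JA[j]];
        Y[i] = sum;
    }
}

/* Driver mirroring the call loop in the question.
   Assumption: rows is a multiple of TS. */
void chunked_spmv(size_t rows, size_t TS, const double *A, const int *JA,
                  const int *IA, const double *X, double *Y)
{
    assert(TS > 0 && rows % TS == 0);
    for (size_t k = 0; k < rows / TS; ++k) {
        const int chunkIndex = IA[k * TS];
        serial_matvec_rebased(TS, &A[chunkIndex], &JA[chunkIndex],
                              &IA[k * TS], X, &Y[k * TS]);
    }
}
```

Another option, if the chunks do not need to be separate copies, is to pass A and JA unshifted (&A[0], &JA[0]) and offset only IA and Y. The global values in IA then index the full arrays directly and the original kernel works unchanged.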


