Segmentation fault in vector addition in CUDA

Question

I was experimenting with a toy CUDA program.

I declare a float array, transfer it to the GPU, add a number to each element of the array, transfer it back to the host, and print the array. However, this does not work, and it gives me a segmentation fault.

Here is the code:

#include <iostream>
using namespace std;

__global__ void kern(float *a, float *C){
    for (int i = 0; i < 3; i++) C[i] = a[i] + i;
}

int main(){
    float *A = new float[3];
    for(int i = 0; i < 3; i++){
        A[i] = i;
    }

    float * d;
    float * C;
    cudaMalloc(&C, sizeof(float)*3);
    cudaMalloc(&d, sizeof(float)*3);
    cudaMemcpy(&d, A, sizeof(float)*3, cudaMemcpyHostToDevice);
    kern<<<1, 1>>>(d, C);

    cudaMemcpy(&A, C, sizeof(float)*3, cudaMemcpyDeviceToHost);

    cout << A[2];

}

Also, I am not familiar with malloc. Most of my experience is with C++, so I am more comfortable with new datatype[]. Is there an equivalent for CUDA?

Answer 1

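cudaMemcpy expects the destination and source pointers themselves, not the addresses of the pointer variables. Change this: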

cudaMemcpy(&d, A, sizeof(float)*3, cudaMemcpyHostToDevice);
cudaMemcpy(&A, C, sizeof(float)*3, cudaMemcpyDeviceToHost);

To this:

cudaMemcpy(d, A, sizeof(float)*3, cudaMemcpyHostToDevice);
cudaMemcpy(A, C, sizeof(float)*3, cudaMemcpyDeviceToHost);

Also, it is always better to check the return code of every CUDA call; the codes give you a much better idea of what is going wrong.
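As a rough sketch of both points together, the program from the question could be written as below, with the corrected cudaMemcpy calls and every return code checked. The CUDA_CHECK macro and the run_demo wrapper are hypothetical names introduced only for this illustration; they are not part of CUDA or of the original post. The kernel is the one from the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the original post): aborts with a
// readable message if a CUDA runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Same kernel as in the question: a single thread adds i to element i.
__global__ void kern(const float *a, float *C) {
    for (int i = 0; i < 3; i++) C[i] = a[i] + i;
}

// Hypothetical wrapper around the question's main(): returns A[2].
float run_demo() {
    const int n = 3;
    const size_t bytes = sizeof(float) * n;

    float *A = new float[n];
    for (int i = 0; i < n; i++) A[i] = i;

    float *d = nullptr;
    float *C = nullptr;
    CUDA_CHECK(cudaMalloc(&d, bytes));
    CUDA_CHECK(cudaMalloc(&C, bytes));

    // Pass the device pointer d itself, not &d.
    CUDA_CHECK(cudaMemcpy(d, A, bytes, cudaMemcpyHostToDevice));

    kern<<<1, 1>>>(d, C);
    CUDA_CHECK(cudaGetLastError());       // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    // Pass the host pointer A itself, not &A.
    CUDA_CHECK(cudaMemcpy(A, C, bytes, cudaMemcpyDeviceToHost));

    float result = A[2];

    CUDA_CHECK(cudaFree(d));
    CUDA_CHECK(cudaFree(C));
    delete[] A;
    return result;
}

int main() {
    std::printf("%g\n", run_demo());  // prints 4 (A[2] = 2 + 2)
    return 0;
}
```

With the original &d and &A arguments, the failing call would be reported with its file and line by a check like this one, instead of surfacing later as a crash at the print statement.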

3 | accepted | 2014-11-05 11:19:54
