CUDA's cudaMemcpyToSymbol() throws "invalid argument" error

Question

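The problem

I am trying to copy an int array into the device's constant memory, but I keep getting the following error:

[ERROR] 'invalid argument' (11) in 'main.cu' at line '386'

The code

There is a lot of code, so I will simplify what I have.

I have declared a device __constant__ variable at the top of my main.cu file, outside any function.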

__device__ __constant__ int* dic;

I also have a host variable, flatDic, which is allocated inside main() in the following way:

int* flatDic = (int *)malloc(num_codewords*(bSizeY*bSizeX)*sizeof(int));

I then try to copy the contents of flatDic into dic, also inside main():

cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int));

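This cudaMemcpyToSymbol() call is line 386 of main.cu, and it is where the error above is thrown.

What I've tried

Here is what I have tried so far to solve the problem:

I have tried all of the following, and each returns the same error: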

cudaMemcpyToSymbol(dic, &flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int));

cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int));

cudaMemcpyToSymbol(dic, &flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int), 0, cudaMemcpyHostToDevice);

cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int), 0, cudaMemcpyHostToDevice);

I have also tried calling cudaMalloc() on the dic variable before calling cudaMemcpyToSymbol(). cudaMalloc() does not throw any error, but the cudaMemcpyToSymbol() error persists.

cudaMalloc((void **) &dic, num_codewords*(bSizeY*bSizeX)*sizeof(int));

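I have also searched extensively through the web, documentation, forums, examples, and so on, all to no avail.

Does anyone see what is wrong with my code? Thanks in advance.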

Answer 1

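cudaMemcpyToSymbol copies to a constant variable. Here you are trying to copy many bytes of type int (an allocated ARRAY) into a pointer of type int *. The types do not match, and the copy is far larger than the pointer-sized symbol, hence the "invalid argument" error. To make this work, you need to copy an ARRAY of int (allocated on the host) into an ARRAY of int (constant, with a static length) on the device, e.g.: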

__device__ __constant__ int dic[LEN];

Here is an example from the CUDA C Programming Guide (which I suggest you read; it is quite good!):

__constant__ float constData[256];
float data[256];
cudaMemcpyToSymbol(constData, data, sizeof(data));
cudaMemcpyFromSymbol(data, constData, sizeof(data));
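Applied to the question, the same pattern might look like the sketch below. It is only an illustration under some assumptions: MAX_DIC_LEN, the readDic kernel, and the sizes are made up for the example, and the whole array is assumed to fit in constant memory, which is limited to 64 KB in total.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical compile-time bound (not from the question). Constant memory is
// limited to 64 KB, so MAX_DIC_LEN * sizeof(int) must stay within that.
#define MAX_DIC_LEN 4096

// Statically sized array in constant memory, replacing `int* dic`.
__constant__ int dic[MAX_DIC_LEN];

// Hypothetical kernel: each thread copies one entry of dic into out.
__global__ void readDic(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = dic[i];
}

int main()
{
    // Illustrative sizes only; the question does not give the real values.
    const int num_codewords = 16, bSizeY = 8, bSizeX = 8;
    const int n = num_codewords * bSizeY * bSizeX;
    const size_t bytes = n * sizeof(int);
    if (n > MAX_DIC_LEN) {
        fprintf(stderr, "dictionary does not fit in dic[]\n");
        return 1;
    }

    int *flatDic = (int *)malloc(bytes);
    int *result = (int *)malloc(bytes);
    for (int i = 0; i < n; ++i) flatDic[i] = i;

    // Pass the host pointer itself (not &flatDic); the symbol is the array.
    cudaError_t err = cudaMemcpyToSymbol(dic, flatDic, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Read the constant array back through a kernel to verify the copy.
    int *d_out = NULL;
    cudaMalloc((void **)&d_out, bytes);
    readDic<<<(n + 255) / 256, 256>>>(d_out, n);
    cudaMemcpy(result, d_out, bytes, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (result[i] != flatDic[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_out);
    free(flatDic);
    free(result);
    return mismatches != 0;
}
```

If the real dictionary is larger than 64 KB, this approach will not work as written, and the data would have to live in global memory instead.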

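As far as I know, you can also cudaMemcpyToSymbol a pointer into a pointer (unlike your example, where you are copying an array into a pointer), but be aware that only the pointer will be constant, not the device memory it points to. If you go this route, you need to add a cudaMalloc and then cudaMemcpyToSymbol the resulting pointer to device memory into your __constant__ device variable. AGAIN, in this case the array values will NOT be constant; only the pointer to the memory will be.

Your calls for this case would look something like this: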

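// Assumes a file-scope declaration: __constant__ int *c_dic_ptr;
int *d_dic;
cudaMalloc((void **) &d_dic, num_codewords*(bSizeY*bSizeX)*sizeof(int));
// Copy only the pointer value into the constant symbol.
cudaMemcpyToSymbol(c_dic_ptr, &d_dic, sizeof(int *));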
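For completeness, here is a rough end-to-end sketch of this pointer route. It is an illustration rather than a definitive implementation: the CHECK macro, the readDic kernel, and the array size are invented for the example, and c_dic_ptr is assumed to be declared as a __constant__ int * at file scope.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(e));                           \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Only this pointer lives in constant memory; the data it points to
// stays in ordinary global memory.
__constant__ int *c_dic_ptr;

// Hypothetical kernel: reads the array through the constant pointer.
__global__ void readDic(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = c_dic_ptr[i];
}

int main()
{
    const int n = 1024;                  // illustrative size
    const size_t bytes = n * sizeof(int);

    int *flatDic = (int *)malloc(bytes);
    int *result = (int *)malloc(bytes);
    for (int i = 0; i < n; ++i) flatDic[i] = 2 * i;

    // 1. Allocate global memory and copy the array contents into it.
    int *d_dic = NULL;
    CHECK(cudaMalloc((void **)&d_dic, bytes));
    CHECK(cudaMemcpy(d_dic, flatDic, bytes, cudaMemcpyHostToDevice));

    // 2. Copy the pointer value (sizeof(int *) bytes) into the symbol.
    CHECK(cudaMemcpyToSymbol(c_dic_ptr, &d_dic, sizeof(int *)));

    // 3. Use the pointer from a kernel and verify the result on the host.
    int *d_out = NULL;
    CHECK(cudaMalloc((void **)&d_out, bytes));
    readDic<<<(n + 255) / 256, 256>>>(d_out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(result, d_out, bytes, cudaMemcpyDeviceToHost));

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (result[i] != flatDic[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    CHECK(cudaFree(d_out));
    CHECK(cudaFree(d_dic));
    free(flatDic);
    free(result);
    return mismatches != 0;
}
```

Note that the array contents are transferred with a regular cudaMemcpy; cudaMemcpyToSymbol moves only the pointer.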

Also, while debugging, you should wrap your CUDA calls in error-checking logic. I have borrowed the following logic from talonmies:

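#include <cstdio>   // fprintf
#include <cstdlib>  // exit

// Prints the CUDA error string with file and line, then optionally exits.
__host__ inline void gpuAssert(cudaError_t code, const char *file, int line,
                               bool abort = true)
{
   if (code != cudaSuccess)
   {
      fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code),
              file, line);
      if (abort) exit(code);
   }
}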

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

To use it, simply wrap your CUDA call in it, like so:

gpuErrchk(cudaMemcpyToSymbol(dic, flatDic, num_codewords*(bSizeY*bSizeX)*sizeof(int)));

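If you run into an allocation problem or another common error, the program will exit and print an error message.

To check a kernel, do the following: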

MyKernel<<<BLK,THRD>>>(vars...);

//Make sure nothing went wrong.
gpuErrchk(cudaPeekAtLastError());
gpuErrchk(cudaDeviceSynchronize());

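Thanks to talonmies for the error-checking code!

Note:
Even if you were using a vanilla cudaMemcpy, your code would fail, because you have not cudaMalloc'ed memory for your array. In that case, though, the failure would likely be the GPU equivalent of a segfault (probably "Unspecified launch failure"), since the pointer would hold some garbage value and you would be trying to write to the memory address given by that garbage value.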
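To illustrate the point about plain cudaMemcpy, a minimal sketch of the allocate-then-copy order follows. The buffer name d_buf and the size are assumptions made for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int n = 256;                   // illustrative size
    int host_in[n], host_out[n];
    for (int i = 0; i < n; ++i) host_in[i] = i;

    // The device pointer must receive a valid address from cudaMalloc
    // before it is used as a cudaMemcpy destination.
    int *d_buf = NULL;
    cudaError_t err = cudaMalloc((void **)&d_buf, sizeof(host_in));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Host -> device, then device -> host, to confirm the round trip.
    err = cudaMemcpy(d_buf, host_in, sizeof(host_in), cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(host_out, d_buf, sizeof(host_in),
                         cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        cudaFree(d_buf);
        return 1;
    }

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (host_out[i] != host_in[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_buf);
    return mismatches != 0;
}
```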

Solution 1
4 Accepted 2012-03-13 03:27:12