简体   繁体   中英

Linking error while separating CUDA function into declaration and definition

I'm following instructions provided on NVidia blog post on how to separate functions called from a kernel into declaration and definition. Using CUDA 10 version and Visual Studio compiler produces linking errors. To nvcc compiler's options I've added -dc , as instructed in the referenced post. The files are all located in the same folder under the same project.

test.cuh

__host__ __device__ float test(float, float);

test.cu

#include "test.cuh"    
__host__ __device__ float test(float a, float b)
{
    return a + b;
}

kernel.cu

#include <stdio.h>
#include "test.cuh"
__global__ void addKernel(int *c, const int *a, const int *b)
{
    int i = threadIdx.x;
    c[i] =  test(a[i], b[i]);
}

Linking error

1>kernel.cu.obj : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_41_tmpxft_0000796c_00000000_7_kernel_cpp1_ii_f853efa9 referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
1>test.cuh.obj : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00006d84_00000000_7_test_cpp1_ii_f2c23be0 referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
1>test.cu.obj : error LNK2019: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00008044_00000000_7_test_cpp1_ii_f2c23be0 referenced in function "void __cdecl __nv_cudaEntityRegisterCallback(void * *)" (?__nv_cudaEntityRegisterCallback@@YAXPEAPEAX@Z)
1>D:\Workspaces\src\sandbox\cuda_dc\x64\Debug\cuda_dc.exe : fatal error LNK1120: 3 unresolved externals

It doesn't make any difference if change file extensions to ".c", ".cpp", or ".cuh"

These are the steps I followed, using the code you have shown, plus adding a simple main() function so we can have a complete project.

(In Visual Studio)

  1. File..New..Project
  2. on left hand side, scroll down to NVIDIA and select it
  3. select CUDA XY runtime project, give project a name, click OK
  4. at the top menu bar, next to Debug, change x86 to x64
  5. The project should have a default file in it, kernel.cu. Replace the contents of this with (modifying your kernel.cu to add a main function):

     #include <stdio.h> #include "test.cuh" __global__ void addKernel(int *c, const int *a, const int *b) { int i = threadIdx.x; c[i] = test(a[i], b[i]); } int main() { int *c = NULL; int *a = NULL; int *b = NULL; addKernel << <1, 1 >> > (c, a, b); } 

(in windows, eg using file manager)

  1. In the project folder where kernel.cu is located, place your files test.cuh and test.cu (the updated versions you posted without C linkage)

(in visual studio)

  1. Go to the project in the solution explorer windows, right click on the project name, and select Properties
  2. On the left hand side of the dialog, select "CUDA C/C++"
  3. On the right hand side, change the drop-down next to "Generate Relocatable Device Code" from No to Yes
  4. On the left side, select "CUDA linker" and confirm that "Perform device link" is already set to yes
  5. Select OK to close the dialog

  6. Again, in the solution explorer window, right-click on the project name and select add... existing item

  7. A file selection dialog should open. You should see the kernel.cu file plus the test.cu and test.cuh files you added to the folder
  8. Select and add the test.cu file
  9. Now select Build...Rebuild Solution

When I do those steps, I get a clean compilation with no errors:

1>------ Rebuild All started: Project: test37, Configuration: Debug x64 ------
1>
1>  c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile   -g   -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o x64\Debug\kernel.cu.obj "c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37\kernel.cu" -clean
1>CUDACOMPILE : nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1>  kernel.cu
1>
1>  c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile   -g   -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o x64\Debug\test.cu.obj "c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37\test.cu" -clean
1>CUDACOMPILE : nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1>  test.cu
1>  Compiling CUDA source file kernel.cu...
1>  Compiling CUDA source file test.cu...
1>
1>  c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2015 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -rdc=true -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static  -g   -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o x64\Debug\kernel.cu.obj "c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37\kernel.cu"
1>
1>  c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2015 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -rdc=true -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static  -g   -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /FS /Zi /RTC1 /MDd " -o x64\Debug\test.cu.obj "c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37\test.cu"
1>CUDACOMPILE : nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1>  kernel.cu
1>CUDACOMPILE : nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1>  test.cu
1>
1>  c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\test37>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -dlink -o x64\Debug\test37.device-link.obj -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64" cudart.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib  -gencode=arch=compute_20,code=sm_20 -G --machine 64 x64\Debug\kernel.cu.obj x64\Debug\test.cu.obj
1>CUDALINK : nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
1>  cudart.lib
1>  kernel32.lib
1>  user32.lib
1>  gdi32.lib
1>  winspool.lib
1>  comdlg32.lib
1>  advapi32.lib
1>  shell32.lib
1>  ole32.lib
1>  oleaut32.lib
1>  uuid.lib
1>  odbc32.lib
1>  odbccp32.lib
1>  kernel.cu.obj
1>  test.cu.obj
1>  test37.vcxproj -> c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\x64\Debug\test37.exe
1>  test37.vcxproj -> c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\x64\Debug\test37.pdb (Full PDB)
1>  copy "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\cudart*.dll" "c:\Users\bob-tosh\documents\visual studio 2015\Projects\test37\x64\Debug\"
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\cudart32_80.dll
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\cudart64_80.dll
1>          2 file(s) copied.
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

My opinion is that if you can follow the above steps exactly, starting with a new project and using the files I indicate, and you get the same results I do, then the problem you are describing in your question relates to something you haven't shown or haven't described. You should then provide a MCVE , and be sure to provide the same level of specifics I have given in my answer. Every step used to create, build, and compile the project, along with the console build output and all files used.

I've used CUDA 8 and Visual Studio 2015, but I don't think there should be substantial differences for what I am describing here with a newer VS and newer CUDA versions.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM