
Relocatable Code for a CMake/CUDA shared library

I've stumbled upon a doozy of a compile error that I've never encountered before. The exact code cannot be shared, so I will present an analogous situation. I have a shared library I'm developing, which compiles __device__-tagged device code. These device functions must be usable by __global__ functions written by the user. Here is a boiled-down set of code which recreates the error:

The source code for the shared library: device_function.cu

__device__ int deviceFunction()
{
    return 1;
}

The source code for the executable meant to call the device function: source.cu

#include <stdio.h>

__device__ int deviceFunction();

__global__ void globalFunction()
{
    printf("%i", deviceFunction());
}

int main()
{
    globalFunction<<<1,1>>>();
    cudaDeviceSynchronize();

    return 0;
}

The CMakeLists.txt file I've attempted to compile everything with:

cmake_minimum_required(VERSION 3.21)

set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR})
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR})

project(test)

find_package(CUDA REQUIRED)
enable_language(CUDA)

add_library(device_function SHARED device_function.cu)

add_executable(cu_test source.cu)
target_link_libraries(cu_test device_function)

Upon (attempted) compilation, I'm greeted with this:

[main] Building folder: relocatable-code 
[build] Starting build
[proc] Executing command: /snap/cmake/current/bin/cmake --build /home/legayone/Documents/research-pathfinding-projects/cuda-programming/relocatable-code/build --config Debug --target all -j 18 --
[build] Consolidate compiler generated dependencies of target device_function
[build] [ 50%] Built target device_function
[build] Consolidate compiler generated dependencies of target cu_test
[build] [ 75%] Building CUDA object CMakeFiles/cu_test.dir/source.cu.o
[build] ptxas fatal   : Unresolved extern function '_Z14deviceFunctionv'
[build] make[2]: *** [CMakeFiles/cu_test.dir/build.make:76: CMakeFiles/cu_test.dir/source.cu.o] Error 255
[build] make[1]: *** [CMakeFiles/Makefile2:111: CMakeFiles/cu_test.dir/all] Error 2
[build] make: *** [Makefile:91: all] Error 2
[build] Build finished with exit code 2

What I've Tried

I have scoured the web and arrived at solutions which all ultimately boil down to CUDA_SEPARABLE_COMPILATION ON or some form of -rdc=true or -dc. I have attempted adding separable compilation in the 3 possible combinations for device_function and cu_test, and I've done the same for -rdc=true and -dc: first on one target, then on the other, then on both. This is the format in which I add -rdc=true and -dc:

target_compile_options(cu_test PRIVATE $<$<COMPILE_LANGUAGE:CUDA>:-c "-lcudart -lcudadevrt -lcuda -rdc=true">)

~or~

target_compile_options(device_function PRIVATE $<$<COMPILE_LANGUAGE:CUDA>:-c "-lcudart -lcudadevrt -lcuda -rdc=true">)
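For comparison, separable compilation is usually requested through a target property rather than raw flags, and the -lcudart, -lcudadevrt and -lcuda entries above are linker inputs rather than compile options. A minimal sketch, assuming the same two target names as in this question; as described above, turning this on did not resolve the error while device_function remained a SHARED library:

```cmake
# Sketch only: ask CMake to build relocatable device code for each target.
# CMake translates this property into the appropriate nvcc flag and adds
# a device-link step where one is needed.
set_target_properties(device_function PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
set_target_properties(cu_test         PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
```

The property can be set on one target, the other, or both, which mirrors the three combinations tried above.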

My question

What am I missing, or what am I doing wrong? I'd really like any executable to be able to link against the shared library device_function and call the function inside that shared library. In the actual library this applies to, there are headers of course, but I have the includes sorted out :) it's just the linking.

Where I suspect the issue is

I suspect the issue is code relocatability. I'm aware special things have to be done to permit device functions from a different compilation unit to be used by an executable (or another library), but what are those things and how do I do them in CMake?

A Kind Of Solution

So it would seem that calling __device__ functions that live in a shared library from __global__ functions in a separate compilation unit is not possible. I should note that there is a tremendous amount of conflicting information on this, specifically from this article: https://developer.nvidia.com/blog/building-cuda-applications-cmake/ , which seems to suggest it is possible; however, the solutions it presents did not work for me. Here is what does work for me:

CMakeLists.txt

cmake_minimum_required(VERSION 3.21)

set(CMAKE_CUDA_SEPARABLE_COMPILATION ON)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)

project(test LANGUAGES CXX CUDA)

include(CTest)

add_library(device_function STATIC device_function.cu)
add_library(shared_function SHARED shared_device_function.cu)
target_link_libraries(shared_function PUBLIC device_function)

add_executable(cu_test source.cu)
target_link_libraries(cu_test shared_function)

What Works

  • Linking works, as does compilation
  • The device function in device_function.cu can be called from the global function in the executable's source, source.cu

What Does Not Work

  • The actual file containing the device function must be built as a static library, but that static library can be linked into a shared library, so the user can still use the whole library through a single link to the shared library
  • This static library method is, apparently, slower than the shared library method
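
The same arrangement can also be written with per-target properties instead of the global CMAKE_ variables, which keeps the settings from applying to unrelated targets in a larger project. This is an untested sketch, assuming the same file and target names as the CMakeLists.txt above (shared_device_function.cu is not shown in this post); it is meant to illustrate the layout, not to be a definitive configuration:

```cmake
cmake_minimum_required(VERSION 3.21)
project(test LANGUAGES CXX CUDA)

# Static library that holds the __device__ function, built as relocatable
# device code so other compilation units can link against it on the device side.
add_library(device_function STATIC device_function.cu)
set_target_properties(device_function PROPERTIES
    CUDA_SEPARABLE_COMPILATION ON
    POSITION_INDEPENDENT_CODE ON)  # it is linked into a shared library below

# Shared library that users link against.
add_library(shared_function SHARED shared_device_function.cu)
set_target_properties(shared_function PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
# PUBLIC so that consumers of shared_function also link device_function.
target_link_libraries(shared_function PUBLIC device_function)

# Executable whose __global__ function calls the device function.
add_executable(cu_test source.cu)
set_target_properties(cu_test PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
target_link_libraries(cu_test PRIVATE shared_function)
```

The PUBLIC link is the part that matters here: it passes the static library on to cu_test, so the executable's own device-link step can see the device code, while the user still names only shared_function.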
