Compiling and Linking CUDA with Clang to Support C++20 in Host Code
I am trying to compile, link and execute a simple CUDA example using Clang instead of GCC. The general idea behind using Clang is to allow C++20 in host code and to get more compiler optimizations from the LLVM/Clang stack.
I looked at the following sources: the llvm docs and the google paper about gpucc. This example is from the llvm documentation about compiling CUDA with clang:
#include <iostream>

__global__ void axpy(float a, float* x, float* y) {
  y[threadIdx.x] = a * x[threadIdx.x];
}

int main(int argc, char* argv[]) {
  const int kDataLen = 4;

  float a = 2.0f;
  float host_x[kDataLen] = {1.0f, 2.0f, 3.0f, 4.0f};
  float host_y[kDataLen];

  // Copy input data to device.
  float* device_x;
  float* device_y;
  cudaMalloc(&device_x, kDataLen * sizeof(float));
  cudaMalloc(&device_y, kDataLen * sizeof(float));
  cudaMemcpy(device_x, host_x, kDataLen * sizeof(float),
             cudaMemcpyHostToDevice);

  // Launch the kernel.
  axpy<<<1, kDataLen>>>(a, device_x, device_y);

  // Copy output data to host.
  cudaDeviceSynchronize();
  cudaMemcpy(host_y, device_y, kDataLen * sizeof(float),
             cudaMemcpyDeviceToHost);

  // Print the results.
  for (int i = 0; i < kDataLen; ++i) {
    std::cout << "y[" << i << "] = " << host_y[i] << "\n";
  }

  cudaDeviceReset();
  return 0;
}
The command used to compile and link is:
clang++-12 axpy.cu -o axpy --cuda-gpu-arch=sm_72
-L/usr/local/cuda-11.4/lib64 -lcudart -ldl -lrt -pthread axpy.cu
--cuda-path=/usr/local/cuda-11 --no-cuda-version-check
The output indicates that it compiles successfully but fails to link:
clang: warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11040. Assuming the latest supported version 10.1 [-Wunknown-cuda-version]
/usr/bin/ld: /tmp/axpy-35c781.o: in function `__device_stub__axpy(float, float*, float*)':
axpy.cu:(.text+0x0): multiple definition of `__device_stub__axpy(float, float*, float*)'; /tmp/axpy-c82a7d.o:axpy.cu:(.text+0x0): first defined here
/usr/bin/ld: /tmp/axpy-35c781.o: in function `main':
axpy.cu:(.text+0xa0): multiple definition of `main'; /tmp/axpy-c82a7d.o:axpy.cu:(.text+0xa0): first defined here
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The error seems to indicate that clang makes multiple passes over the code when linking and wrongly includes main twice.
OS: Ubuntu 20.04, kernel 5.40; CUDA: 11.4; Clang: tried 11/12/13.
I would be grateful for any hints on how to get CUDA and Clang working together. Things I have tried so far: different Clang versions (11/12/13) and different CUDA versions (11.2/11.4).
The output indicates that it successfully compiles but fails to link:
axpy.cu:(.text+0xa0): multiple definition of `main';
This seems to have been sorted out in the comments:
Maybe the only problem is that you pass axpy.cu twice in your compile command.
clang++-12 axpy.cu -o axpy --cuda-gpu-arch=sm_72 -L/usr/local/cuda-11.4/lib64 -lcudart -ldl -lrt -pthread axpy.cu --cuda-path=/usr/local/cuda-11 --no-cuda-version-check
^^^^^^^ ^^^^^^^
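With the duplicate input removed, the command looks along these lines; note that adding -std=c++20 is an assumption here, matching the question's stated goal of C++20 host code, and the paths are those from the original command:

```shell
clang++-12 axpy.cu -o axpy --cuda-gpu-arch=sm_72 \
  --cuda-path=/usr/local/cuda-11 --no-cuda-version-check \
  -L/usr/local/cuda-11.4/lib64 -lcudart -ldl -lrt -pthread \
  -std=c++20
```

Each source file listed on the command line is compiled into its own object file, so listing axpy.cu twice produces two objects that both define main and the kernel stub, which is exactly the multiple-definition error the linker reported.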
That was it, thank you.