Questions about CUDA macro __CUDA_ARCH__

I have a simple CUDA program in ttt.cu:

#include <iostream>
__global__ void example(){
   printf("__CUDA_ARCH__: %d \n", __CUDA_ARCH__);
}
int main(){
example<<<1,1>>>();
}

with CMakeLists.txt:

cmake_minimum_required(VERSION 3.18)
project(Hello)
find_package(CUDA REQUIRED)

cuda_add_executable(sss ttt.cu)

Then I got the error: identifier "__CUDA_ARCH__" is undefined. Why does this happen, and what should I do to make __CUDA_ARCH__ valid? And can we use a valid __CUDA_ARCH__ in host code within a .h header file?

__CUDA_ARCH__ is a compiler macro.

can we use valid __CUDA_ARCH__ in host code

No, it is intended to be used in device code only:

The host code (the non-GPU code) must not depend on it.
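For a header that is included from both .cu and plain .cpp files, one common pattern is to branch on whether the macro is defined instead of reading its value unconditionally. The following is a minimal sketch, not a definitive implementation; the names ARCH_HD and compiled_arch are hypothetical and chosen only for illustration.

```cpp
// Hypothetical header content, usable from both .cu and plain .cpp files.

// __CUDACC__ is defined when nvcc is compiling the file. A plain C++ compiler
// does not know the CUDA function qualifiers, so hide them behind a macro.
#ifdef __CUDACC__
#define ARCH_HD __host__ __device__
#else
#define ARCH_HD
#endif

// Returns the architecture value seen by the current compilation pass,
// or 0 when the code is being compiled for the host.
ARCH_HD inline int compiled_arch() {
#ifdef __CUDA_ARCH__
    return __CUDA_ARCH__;  // device compilation pass, for example 520
#else
    return 0;              // host compilation pass: the macro is undefined
#endif
}
```

Only the function body depends on the macro here; the signature is the same in every compilation pass. Called from host code, compiled_arch() returns 0, which is consistent with the rule that host code cannot obtain a meaningful __CUDA_ARCH__ value.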

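You cannot print a compiler macro the way you are imagining. It is not an ordinary numeric variable defined in C++. nvcc defines __CUDA_ARCH__ only while compiling device code; in the host compilation pass the macro is undefined, so the preprocessor leaves the token in place and the compiler sees an identifier that was never declared, hence the error. You could have the compiler print the macro's value, but that would happen at compile time, not at run time.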

To print at run-time, you could do something like this:

$ cat t2.cu
#include <cstdio>
#define STR_HELPER(x) #x
#define STR(x) STR_HELPER(x)

__device__ void print_arch(){
  const char my_compile_time_arch[] = STR(__CUDA_ARCH__);
  printf("__CUDA_ARCH__: %s\n", my_compile_time_arch);
}
__global__ void example()
{
   print_arch();
}

int main(){

example<<<1,1>>>();
cudaDeviceSynchronize();
}
$ nvcc -o t2 t2.cu

$ ./t2
__CUDA_ARCH__: 520
$
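
The 520 printed above is fixed at compile time. It presumably reflects the default target of the nvcc version used, since no -arch flag was given, and not necessarily the GPU the program ran on. nvcc documents the value as xy0 when compiling for compute capability x.y. As a small sketch of that encoding, the helper below splits such a value into its major and minor parts; the names ComputeCapability and decode_arch are hypothetical and not part of any CUDA API.

```cpp
// Hypothetical helper: decode a __CUDA_ARCH__-style value (xy0 for
// compute capability x.y) into its major and minor components.
struct ComputeCapability {
    int cc_major;
    int cc_minor;
};

constexpr ComputeCapability decode_arch(int arch) {
    // 520 -> {5, 2}: the hundreds and above give the major version,
    // the tens digit gives the minor version.
    return ComputeCapability{arch / 100, (arch % 100) / 10};
}
```

With this sketch, decode_arch(520) yields major 5 and minor 2, that is, compute capability 5.2.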

Note that there are quite a few questions here on the cuda tag discussing __CUDA_ARCH__; you may wish to review some of them.
