
How can I get the number of cores in a CUDA device?

I am looking for a function that returns the number of cores in my CUDA device. I know each multiprocessor has a specific number of cores, and my CUDA device has 2 multiprocessors.

I searched a lot for a device property that gives the number of cores per multiprocessor, but I couldn't find one. I use the code below, but I still need the number of cores.

  • CUDA 7.0
  • Programming language: C
  • Visual Studio 2013

Code:

void printDevProp(cudaDeviceProp devProp)
{
    // The size_t fields are cast to unsigned long long and printed with %llu;
    // %u is the wrong specifier for size_t and misprints on 64-bit builds.
    printf("%s\n", devProp.name);
    printf("Major revision number:         %d\n", devProp.major);
    printf("Minor revision number:         %d\n", devProp.minor);
    printf("Total global memory:           %llu bytes\n", (unsigned long long)devProp.totalGlobalMem);
    printf("Number of multiprocessors:     %d\n", devProp.multiProcessorCount);
    printf("Total amount of shared memory per block: %llu\n", (unsigned long long)devProp.sharedMemPerBlock);
    printf("Total registers per block:     %d\n", devProp.regsPerBlock);
    printf("Warp size:                     %d\n", devProp.warpSize);
    printf("Maximum memory pitch:          %llu\n", (unsigned long long)devProp.memPitch);
    printf("Total amount of constant memory:         %llu\n", (unsigned long long)devProp.totalConstMem);
}

The number of cores per multiprocessor is the only "missing" piece of data. It is not provided directly in the cudaDeviceProp structure, but it can be inferred from published per-architecture data, keyed on the devProp.major and devProp.minor entries, which together make up the CUDA compute capability of the device.

Something like this should work:

#include <stdio.h>            // printf
#include "cuda_runtime_api.h"
// you must first call the cudaGetDeviceProperties() function, then pass
// the devProp structure returned to this function:
int getSPcores(cudaDeviceProp devProp)
{
    int cores = 0;  // stays 0 when the compute capability is not recognized
    int mp = devProp.multiProcessorCount;
    switch (devProp.major) {
    case 2: // Fermi
        if (devProp.minor == 1) cores = mp * 48;
        else cores = mp * 32;
        break;
    case 3: // Kepler
        cores = mp * 192;
        break;
    case 5: // Maxwell
        cores = mp * 128;
        break;
    case 6: // Pascal
        if ((devProp.minor == 1) || (devProp.minor == 2)) cores = mp * 128;
        else if (devProp.minor == 0) cores = mp * 64;
        else printf("Unknown device type\n");
        break;
    case 7: // Volta and Turing
        if ((devProp.minor == 0) || (devProp.minor == 5)) cores = mp * 64;
        else printf("Unknown device type\n");
        break;
    case 8: // Ampere
        if (devProp.minor == 0) cores = mp * 64;
        else if (devProp.minor == 6) cores = mp * 128;
        else printf("Unknown device type\n");
        break;
    default:
        printf("Unknown device type\n");
        break;
    }
    return cores;
}

(coded in browser)

"Cores" is a bit of a marketing term. The most common connotation, in my opinion, is to equate it with the SP units in the SM. That is the meaning I have demonstrated here. I've also omitted cc 1.x devices, as those device types are no longer supported in CUDA 7.0 and CUDA 7.5.

A Pythonic version is here

On Linux, you can run the following command to get the number of CUDA cores:

nvidia-settings -q CUDACores -t

To get the output of this command in C, use the popen function.
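A minimal sketch of the popen approach, assuming a POSIX system and assuming the command prints a bare integer on its first line (which is what the terse `-t` flag is for). The helper name `read_int_from_command` is hypothetical, not part of any library; on a multi-GPU machine the command may print several lines, and this sketch only reads the first.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes popen/pclose under strict -std= modes */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: runs `cmd` through the shell and parses the first
 * line of its output as a non-negative integer.
 * Returns -1 if the command cannot be started, prints nothing, or prints
 * something that does not start with a number. */
int read_int_from_command(const char *cmd)
{
    FILE *pipe = popen(cmd, "r");
    if (pipe == NULL)
        return -1;

    char line[64];
    int value = -1;
    if (fgets(line, sizeof line, pipe) != NULL) {
        char *end;
        long parsed = strtol(line, &end, 10);
        /* end == line means no digits were consumed */
        if (end != line && parsed >= 0 && parsed <= INT_MAX)
            value = (int)parsed;
    }
    pclose(pipe);
    return value;
}
```

It could then be called as `int cores = read_int_from_command("nvidia-settings -q CUDACores -t");`, treating -1 as "could not determine".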

As Vraj Pandya already said, there is a function (_ConvertSMVer2Cores) in the Common/helper_cuda.h file in NVIDIA's cuda-samples GitHub repository that provides this functionality. You just need to multiply its result by the multiprocessor count of the GPU.

Just wanted to provide a current link.

#include <stdio.h>
#include <cuda.h>
#include <cuda_runtime.h>
#include <helper_cuda.h> // This header must be on the compiler's include path.
                         // The file itself seems to also require the
                         // `helper_string.h` file (in the same folder as
                         // `helper_cuda.h`).

int main(void)
{
    int deviceID;
    cudaDeviceProp props;

    // Query the properties of the currently selected device.
    cudaGetDevice(&deviceID);
    cudaGetDeviceProperties(&props, deviceID);

    // Cores per SM (from the helper) times the number of SMs.
    int CUDACores = _ConvertSMVer2Cores(props.major, props.minor) * props.multiProcessorCount;
    printf("CUDA cores: %d\n", CUDACores);
    return 0;
}
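If pulling in helper_cuda.h is not an option, the same kind of per-SM lookup can be sketched by hand as a small table. This is not the helper's actual source; it is a hedged, CUDA-free sketch whose names (`sm_cores`, `cores_per_sm`, `total_cores`) are hypothetical and whose values are copied from the switch statement earlier in this thread, so it covers only the compute capabilities listed there.

```c
#include <stddef.h>

/* One row per compute capability handled by the switch statement
 * earlier in the thread. A minor of -1 matches any minor revision. */
struct sm_cores {
    int major;
    int minor;
    int cores_per_sm;
};

static const struct sm_cores SM_CORES[] = {
    {2,  1,  48},   /* Fermi, cc 2.1 (must precede the 2.x wildcard) */
    {2, -1,  32},   /* Fermi, other 2.x */
    {3, -1, 192},   /* Kepler */
    {5, -1, 128},   /* Maxwell */
    {6,  0,  64},   /* Pascal, cc 6.0 */
    {6,  1, 128},   /* Pascal, cc 6.1 */
    {6,  2, 128},   /* Pascal, cc 6.2 */
    {7,  0,  64},   /* Volta */
    {7,  5,  64},   /* Turing */
    {8,  0,  64},   /* Ampere, cc 8.0 */
    {8,  6, 128},   /* Ampere, cc 8.6 */
};

/* Returns the SP cores per multiprocessor for a compute capability,
 * or 0 if it is not in the table. The first matching row wins. */
int cores_per_sm(int major, int minor)
{
    size_t n = sizeof SM_CORES / sizeof SM_CORES[0];
    for (size_t i = 0; i < n; ++i) {
        if (SM_CORES[i].major == major &&
            (SM_CORES[i].minor == minor || SM_CORES[i].minor == -1))
            return SM_CORES[i].cores_per_sm;
    }
    return 0;
}

/* Total cores = cores per multiprocessor * number of multiprocessors. */
int total_cores(int major, int minor, int multiprocessor_count)
{
    return cores_per_sm(major, minor) * multiprocessor_count;
}
```

With a filled-in cudaDeviceProp, the call would look like `total_cores(props.major, props.minor, props.multiProcessorCount)`; a result of 0 means the device is newer (or older) than the table and a row needs to be added.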

This might help a bit more.

https://devtalk.nvidia.com/default/topic/470848/cuda-programming-and-performance/what-39-s-the-proper-way-to-detect-sp-cuda-cores-count-per-sm-/post/4414371/#4414371

"there is a library helper_cuda.h which contains a routine _ConvertSMVer2Cores(int major, int minor) which takes the compute capability level of the GPU and returns the number of cores (stream processors) in each SM or SMX" (from the post).
