Printing from CUDA kernels

I am writing a CUDA program and trying to print from inside a kernel using the printf function, but compilation fails with the following error:

error : calling a host function("printf") from a __device__/__global__ function("agent_movement_top") is not allowed


 error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_10,compute_10\" --use-local-env --cl-version 2008 -ccbin "c:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin" -I"C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\common\inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\include"  -G  --keep-dir "Debug" -maxrregcount=0  --machine 32 --compile  -g    -Xcompiler "/EHsc /nologo /Od /Zi  /MDd  " -o "Debug\test.cu.obj" "C:\Users\umdutta\Desktop\SANKHA_ALL_MATERIALS\PROGRAMMING_FOLDER\ABM_MODELLING_2D_3D\TRY_NUM_2\test_proj_test\test_proj\test_proj\test.cu"" exited with code 2.

I am using a GTX 560 Ti, which has a compute capability greater than 2.0. When I searched for how to print from CUDA kernels, I read that I need to change the compilation target from sm_10 to sm_20 to take full advantage of the card. Some people also suggested cuPrintf for this purpose. I am a bit confused about what to do, and about the simplest and quickest way to get the output onto my console. If I need to change the nvcc target from compute capability 1.0 to 2.0, how do I do that? I should also mention that I am using Windows 7 and programming in Visual Studio 2010. Thanks for all your help.

To use plain printf() on devices of compute capability (CC) 2.0 or higher, you must compile for at least CC 2.0 and disable the default setting, which includes a build for CC 1.0.

Right-click the .cu file in your project and select Properties, then go to Configuration Properties | CUDA C/C++ | Device. Click the Code Generation line, click the triangle, and select Edit. In the Code Generation dialog box, uncheck Inherit from parent or project defaults, type compute_20,sm_20 in the top window, and click OK.

You can use this code to print whatever you want from inside a CUDA kernel:

#if __CUDA_ARCH__ >= 200
    printf("%d \n", tid);
#endif

and include <stdio.h>.
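For context, a complete program around that snippet might look like the sketch below. The kernel name printThreadId, the helper launchAndSync, and the launch configuration are assumptions for illustration, not part of the answer above. The cudaDeviceSynchronize() call matters: device-side printf output is buffered and only flushed at certain points such as synchronization, so a program that exits right after the launch may print nothing.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread prints its global index.
__global__ void printThreadId()
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
#if __CUDA_ARCH__ >= 200
    printf("%d \n", tid);
#endif
}

// Hypothetical helper: launch the kernel, then wait for it to finish so the
// device-side printf buffer is flushed to the console.
cudaError_t launchAndSync()
{
    printThreadId<<<2, 3>>>();              // 2 blocks of 3 threads
    cudaError_t err = cudaGetLastError();   // catches launch errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();         // catches execution errors, flushes output
}

int main()
{
    cudaError_t err = launchAndSync();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With this launch configuration the six indices 0 to 5 are printed one per line, in no guaranteed order across blocks. Build for an architecture that both your toolkit and your GPU support, for example the compute_20,sm_20 setting described above on CUDA 4.2.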

One way to solve this problem is to use the cuPrintf function, which can print from kernels. Copy the files cuPrintf.cu and cuPrintf.cuh from the folder

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.2\C\src\simplePrintf

to the project folder. Then add the header file cuPrintf.cuh to your project and add

#include "cuPrintf.cu"

to your code. Your code should then follow the format below:

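#include "cuPrintf.cu"

// Each thread writes its message into the cuPrintf device buffer.
__global__ void testKernel(int val)
{
  cuPrintf("Value is: %d\n", val);
}

int main()
{
  cudaPrintfInit();                 // allocate the device-side print buffer
  testKernel<<< 2, 3 >>>(10);       // 2 blocks of 3 threads
  cudaPrintfDisplay(stdout, true);  // copy the buffer to the host and print it with thread IDs
  cudaPrintfEnd();                  // release the print buffer
  return 0;
}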

Following the procedure above prints output from a device function to the console window. Although this solved my problem, I still do not have a solution that uses printf from a device function. If it is really necessary to change my nvcc target from sm_10 to sm_21 to enable printf, it would be very helpful if someone could show me how. Thanks for all your cooperation.

I have CUDA v10.0.130 on Visual Studio 2015 with a GeForce GTX 1060, and all I had to do was add the following include statement:

#include <helper_cuda.h>

Then I was able to use printf without any issues.

I am using a GTX 1650 and a GTX 1050 with C++11. For recent users, this is my suggestion:

In host functions:

#include<iostream>
using namespace std;

cout<< .....(anything you want) << endl;

In kernels:

if(threadIdx.x==0){
    printf("ss=%4.2f \n", ss);
}

Note that this "if" is quite important, and I noticed nobody mentioned it: you may be running a lot of threads, and you definitely do not want every one of them to print. In the format %4.2f, 4 is the minimum field width and 2 is the number of digits after the decimal point, which avoids printing long runs of trailing zeros. Also do not forget the \n to end the line.
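One caveat worth sketching: threadIdx.x==0 is true for one thread in every block, so a launch with many blocks still prints once per block. The example below is a minimal sketch that restricts output to a single thread of the whole grid. The kernel name scaleAndReport, the helper runReport, and the value assigned to ss are made up for illustration, and a one-dimensional grid is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: computes a placeholder value and reports it once.
__global__ void scaleAndReport(float scale)
{
    float ss = scale * 1.5f;  // placeholder computation

    // Only thread 0 of block 0 prints, so the line appears once per launch
    // instead of once per block (1D grid and block assumed).
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        printf("ss=%4.2f \n", ss);
    }
}

// Hypothetical helper: launch 4 blocks of 64 threads and wait for completion,
// which also flushes the device-side printf buffer.
cudaError_t runReport()
{
    scaleAndReport<<<4, 64>>>(2.0f);
    return cudaDeviceSynchronize();
}

int main()
{
    return runReport() == cudaSuccess ? 0 : 1;
}
```

Dropping the blockIdx.x check would print the same line four times here, once for each block.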

You can also use this to print values in shared memory:

if(threadIdx.x==0){
    for(int i=0;i<64;i++){
        for(int j=0;j<8; j++){
            printf("%4.2f  ", ashare[i*8+j]);
        }
        printf("\n");
    }
    printf("\n");
}

This prints shared memory in a neat grid. Note that it also needs to be restricted to threadIdx.x==0.
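The snippet assumes ashare has already been filled. If other threads in the block write to it, a __syncthreads() barrier is needed before thread 0 reads it, otherwise some values may not be written yet. A sketch under assumptions: the kernel name fillAndPrint, the helper runFillAndPrint, and the fill pattern are illustrative, and the 64x8 size is taken from the loop bounds above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define ROWS 64
#define COLS 8

// Hypothetical kernel: all threads fill the shared array, then one thread prints it.
__global__ void fillAndPrint()
{
    __shared__ float ashare[ROWS * COLS];

    // Block-stride loop so any block size covers all ROWS * COLS elements.
    for (int idx = threadIdx.x; idx < ROWS * COLS; idx += blockDim.x) {
        ashare[idx] = 0.25f * idx;  // illustrative fill pattern
    }

    // Wait until every thread in the block has finished writing.
    __syncthreads();

    if (threadIdx.x == 0) {
        for (int i = 0; i < ROWS; i++) {
            for (int j = 0; j < COLS; j++) {
                printf("%4.2f  ", ashare[i * COLS + j]);
            }
            printf("\n");
        }
        printf("\n");
    }
}

// Hypothetical helper: one block of 128 threads, then wait so output is flushed.
cudaError_t runFillAndPrint()
{
    fillAndPrint<<<1, 128>>>();
    return cudaDeviceSynchronize();
}

int main()
{
    return runFillAndPrint() == cudaSuccess ? 0 : 1;
}
```

A single block is launched here because shared memory is private to each block; with several blocks, each block's thread 0 would print its own copy.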
