
All CUDA devices are used for display: cannot debug my CUDA code from within the desktop environment

Since last week I have had a big problem with my CUDA development setup. I have an integrated GPU, to which my monitors are attached, and an extra NVIDIA card for running my CUDA kernels. However, I cannot debug my code anymore, because the debugger reports:

fatal:  All CUDA devices are used for display and cannot be used while debugging. (error code = CUDBG_ERROR_ALL_DEVICES_WATCHDOGGED(0x18)

Somehow it seems that my X server is blocking my NVIDIA GPU, because if I switch to another virtual console (CTRL+ALT+F1) I am able to run my code using cuda-gdb. No monitor cable is plugged into the NVIDIA card.

"lsof /dev/nvidia*" does not give any output. I am using Xubuntu 14.04.

Does anyone have an idea how to solve this problem?

On devices with compute capability 3.5 (SM35) or higher, we can apparently get around this by setting the environment variable:

CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1

This is described on the cuda-gdb documentation page: http://docs.nvidia.com/cuda/cuda-gdb/#axzz4BrMPoaoW
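The variable does not have to be exported for the whole session. As a minimal sketch, assuming a POSIX-style shell, it can be set for a single debugger invocation only. The helper name debug_with_preemption and the CUDA_GDB override are hypothetical and exist only for this illustration; only CUDA_DEBUGGER_SOFTWARE_PREEMPTION itself comes from the answer above.

```shell
# Hypothetical helper: run the debugger with software preemption enabled
# for that one process only, leaving the rest of the shell session untouched.
# CUDA_GDB is an assumed override so a different debugger binary can be used;
# it defaults to cuda-gdb.
debug_with_preemption() {
  CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1 "${CUDA_GDB:-cuda-gdb}" "$@"
}

# Example call (requires cuda-gdb to be installed):
#   debug_with_preemption ./a.out
```

Scoping the variable to one command avoids changing the behaviour of other CUDA tools started from the same shell.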

Here's a test. I am running on a Maxwell Quadro GPU:

nvidia-smi
Fri Jun 17 10:59:47 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.63     Driver Version: 352.63         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro M4000M       Off  | 0000:01:00.0      On |                  N/A |
| N/A   37C    P8     9W / 100W |    158MiB /  4087MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2981    G   /usr/bin/X                                      57MiB |
|    0      9186    G   ...ves-passed-by-fd --v8-snapshot-passed-by-    85MiB |
+-----------------------------------------------------------------------------+
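
In the output above, the Disp.A column reads "On" and /usr/bin/X appears in the process list, so a display is initialized on this GPU. According to the nvidia-smi documentation, a display can be active even when no monitor is physically attached. The following is a rough sketch for listing such GPUs on a machine with several cards. The function name display_gpus is hypothetical, and it assumes the installed nvidia-smi supports the index, name and display_active query fields and prints them as comma-separated values.

```shell
# Hypothetical filter: reads lines of the form "index, name, display_active"
# on stdin and prints "index: name" for every GPU whose display is active.
display_gpus() {
  awk -F', ' '$3 == "Enabled" { print $1 ": " $2 }'
}

# Example call (requires the NVIDIA driver tools to be installed):
#   nvidia-smi --query-gpu=index,name,display_active --format=csv,noheader | display_gpus
```

If every GPU is listed, the debugger has no display-free device left to use, which would match the error shown in the question.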

Build and run the application:

 nvcc -g -G foo.cu
 cuda-gdb ./a.out 

...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
fatal:  All CUDA devices are used for display and cannot be used while debugging. (error code = CUDBG_ERROR_ALL_DEVICES_WATCHDOGGED(0x18)

Now set the environment variable:

export CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1
cuda-gdb ./a.out
(cuda-gdb) r
...
warning: Cuda API error detected: cudaMemcpy returned (0xb)

warning: Cuda API error detected: cudaFree returned (0x11)

[Thread 0x7fffed3ff700 (LWP 10302) exited]
[Thread 0x7ffff7fc6780 (LWP 10293) exited]

For me, it helped to change the screen identifier from 'nvidia' to 'intel' in this block of the X server configuration:

Section "ServerLayout"
  Identifier "layout"
  Screen 0 "intel"
EndSection
