CUDA unknown error (error 30) on cudaMalloc()

I have searched for the reason but had no luck. It fails on even this simple program:

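#include <iostream>
#include <cuda_runtime.h>  // explicit include; nvcc adds it implicitly for .cu files

using namespace std;

int main() {
  int* n;
  // Prints the numeric cudaError_t return code (0 means cudaSuccess).
  cout << cudaMallocManaged(&n, 4 * sizeof(int)) << endl;
  return 0;
}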

The return code is 30 (unknown error). cudaMalloc also fails with the same code.
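For anyone reproducing this, a minimal sketch of how the numeric code can be turned into a readable message. It uses the runtime API call cudaGetErrorString; the message format and the exit code are illustrative choices, not part of the original program.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int* n = 0;
  // Same allocation as in the question: four ints of managed memory.
  cudaError_t err = cudaMallocManaged(&n, 4 * sizeof(int));
  if (err != cudaSuccess) {
    // Report both the numeric code and the runtime's description of it.
    std::fprintf(stderr, "cudaMallocManaged failed: %d (%s)\n",
                 static_cast<int>(err), cudaGetErrorString(err));
    return 1;
  }
  cudaFree(n);
  return 0;
}
```

Built with something like nvcc test.cu (the file name is hypothetical), this prints the description next to the code instead of the bare number.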

This is my hardware:

$ lspci | grep NV
01:00.0 3D controller: NVIDIA Corporation GF117M [GeForce 610M/710M/820M / GT 620M/625M/630M/720M] (rev a1)

$ nvidia-smi
Sat Mar  7 14:02:04 2015       
+------------------------------------------------------+                       
| NVIDIA-SMI 331.113    Driver Version: 331.113        |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVS 5200M           Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   53C  N/A     N/A /  N/A |    279MiB /  1023MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

I am using Ubuntu 14.10 with CUDA 6.0 from the official repository (assuming Ubuntu's packaging has not broken it).

It is a Lenovo T430s laptop, and the card is on Optimus, so that might cause some problems. I have tested the same code on another machine and it works there.

Update 1

OK, nvidia_uvm is not loaded...

$ lsmod |grep nv

nvidia              10744914  65 
nvram                  14362  1 thinkpad_acpi
drm                   310919  6 i915,drm_kms_helper,nvidia

$ sudo modprobe nvidia_uvm
modprobe: ERROR: ../libkmod/libkmod-module.c:816 kmod_module_insert_module() could not find module by name='nvidia_331_updates_uvm'
modprobe: ERROR: could not insert 'nvidia_331_updates_uvm': Function not implemented

Update 2

OK, I reinstalled nvidia-331-updates-uvm and the module is now loaded.

$ lsmod | grep nv
nvidia_uvm             34855  0 
nvidia              10744914  66 nvidia_uvm
nvram                  14362  1 thinkpad_acpi
drm                   310919  6 i915,drm_kms_helper,nvidia

However, the code still returns error 30.

Update 3

After some more testing (mainly running as root), I now get error 71: operation not supported. However, if I use plain cudaMalloc instead, it succeeds. I will also check whether my device supports unified memory addressing.

Update 4

OK, my card only supports SM 2.1, so it does not support Unified Memory.
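A rough sketch of that check, in case it helps someone else: query the compute capability with cudaGetDeviceProperties and fall back to cudaMalloc when managed memory is unlikely to work. It assumes device 0 and uses the rule that Unified Memory in CUDA 6 needs compute capability 3.0 or higher; other requirements, such as a 64-bit OS, are not checked here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  const int device = 0;  // assumption: the first CUDA device is the one in use
  cudaDeviceProp prop;
  cudaError_t err = cudaGetDeviceProperties(&prop, device);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
    return 1;
  }
  std::printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);

  int* n = 0;
  const size_t bytes = 4 * sizeof(int);
  if (prop.major >= 3) {
    // Kepler or newer: managed memory should be available.
    err = cudaMallocManaged(&n, bytes);
  } else {
    // Older cards (e.g. SM 2.1): device-only memory, copied with cudaMemcpy.
    err = cudaMalloc(&n, bytes);
  }
  if (err != cudaSuccess) {
    std::fprintf(stderr, "allocation failed: %d (%s)\n",
                 static_cast<int>(err), cudaGetErrorString(err));
    return 1;
  }
  cudaFree(n);
  return 0;
}
```

On an SM 2.1 card like the one above, this should take the cudaMalloc branch, which matches what Update 3 observed.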

AFAIK, the nvidia_uvm kernel module is required for CUDA to work.

You need to install the package that provides that kernel module, e.g. nvidia-331-uvm, and enable its autoloading by installing the nvidia-modprobe package:

sudo apt-get install nvidia-modprobe nvidia-331-uvm

If you don't want to reboot after installing nvidia-modprobe, you can try running your program as root (e.g. sudo ./a.out); the module should be loaded when the program runs as root.
