
Using Theano with GPU on Ubuntu 14.04 on AWS g2

I'm having trouble getting Theano to use the GPU on my machine.

When I run:

/usr/local/lib/python2.7/dist-packages/theano/misc$ THEANO_FLAGS=floatX=float32,device=gpu python check_blas.py

I get:

WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)

I've also checked that the NVIDIA driver is installed with:

lspci -vnn | grep -i VGA -A 12

which reports:

Kernel driver in use: nvidia

However, when I run nvidia-smi, the result is:

NVIDIA: could not open the device file /dev/nvidiactl (No such file or directory).
NVIDIA-SMI has failed because it couldn't communicate with NVIDIA driver. Make sure that latest NVIDIA driver is installed and running.

and /dev/nvidiactl doesn't exist. What's going on?

UPDATE: /nvidia-smi works with result:

+------------------------------------------------------+
| NVIDIA-SMI 4.304...   Driver Version: 304.116        |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520                | 0000:00:03.0     N/A |                  N/A |
| N/A   39C  N/A     N/A /  N/A |   0%   10MB / 4095MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

After compiling the NVIDIA_CUDA-6.0_Samples and running deviceQuery, I get:

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

CUDA GPUs in a Linux system are not usable until certain "device files" have been properly established.

There is a note to this effect in the documentation.

In general there are several ways these device files can be established:

  1. If an X-server is running.
  2. If GPU activity is initiated by the root user (such as running nvidia-smi or any CUDA app).
  3. Via startup scripts (refer to the documentation linked above for an example).

If none of these steps are taken, the GPUs will not be functional for non-root users. Note that the device files do not persist through reboots and must be re-established on each boot cycle through one of the three methods above. If you use method 2 and reboot, the GPUs will not be available until you use method 2 again.
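To see whether the device files are there after a boot, a small check like the one below may help. It is only a sketch: the function name is made up for illustration, and it assumes the usual naming of /dev/nvidiactl plus one /dev/nvidiaN file per GPU (N starting at 0). It only reports what is missing; it does not create anything.

```python
import os


def missing_nvidia_device_files(gpu_count=1, dev_dir="/dev"):
    """Return the expected NVIDIA device files that do not exist yet.

    Assumes the usual naming: one control file (nvidiactl) plus
    nvidia0, nvidia1, ... for each GPU.
    """
    expected = ["nvidiactl"] + ["nvidia%d" % i for i in range(gpu_count)]
    paths = [os.path.join(dev_dir, name) for name in expected]
    return [path for path in paths if not os.path.exists(path)]


if __name__ == "__main__":
    missing = missing_nvidia_device_files(gpu_count=1)
    if missing:
        print("Missing device files: " + ", ".join(missing))
    else:
        print("All expected NVIDIA device files are present.")
```

If it reports missing files, one of the three methods above (for example running nvidia-smi as root) should establish them.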

I suggest reading the entire Linux getting started guide (linked above) if you are having trouble setting up a Linux system for CUDA GPU usage.

If you are using CUDA 7.5, make sure to follow the official instructions: CUDA 7.5 doesn't support the default g++ version. Install a supported version and make it the default.

sudo apt-get install g++-4.9

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10

sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
sudo update-alternatives --set cc /usr/bin/gcc

sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
sudo update-alternatives --set c++ /usr/bin/g++
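To confirm which g++ is now the default, something like the following sketch could be used. The helper names are hypothetical, and the 4.9 ceiling is an assumption taken from the version installed above, not from an official support table.

```python
import subprocess


def parse_version(text):
    """Turn a string such as '4.9.3' or '5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_supported(version, max_supported=(4, 9)):
    """True when major.minor is not newer than max_supported.

    max_supported defaults to (4, 9) only because g++-4.9 is the
    version installed above; adjust it for other CUDA releases.
    """
    major_minor = (tuple(version) + (0,))[:2]
    return major_minor <= max_supported


if __name__ == "__main__":
    try:
        out = subprocess.check_output(["g++", "-dumpversion"])
        version = parse_version(out.decode())
    except (OSError, ValueError, subprocess.CalledProcessError):
        print("Could not determine the default g++ version.")
    else:
        label = ".".join(str(n) for n in version)
        print("default g++ is %s, supported: %s" % (label, is_supported(version)))
```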

If the Theano GPU test code fails with this error:

ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcublas.so.7.5: cannot open shared object file: No such file or directory
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: cuda unavilable)

use the ldconfig command to make the CUDA 7.5 shared libraries visible to the dynamic linker:

sudo ldconfig /usr/local/cuda-7.5/lib64
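A quick way to check whether the dynamic linker can now find the library, without going through Theano, is to try opening it with ctypes. This is only a sketch; the function name is made up, and the library name is the one from the error message above.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic linker can open the named library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    name = "libcublas.so.7.5"  # name taken from the Theano error message
    if can_load_shared_library(name):
        print(name + " can be loaded")
    else:
        print(name + " cannot be loaded; check ldconfig or LD_LIBRARY_PATH")
```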

I wasted a lot of hours trying to get an AWS G2 instance to work on Ubuntu and failed with exactly the same error you got. Currently I'm running Theano on the GPU smoothly with this Red Hat AMI. To install Theano on Red Hat, follow the "Installing Theano in CentOS" process in the Theano documentation.

I had the same problem and reinstalled CUDA. At the end, the installer says you have to update PATH to include /usr/local/cuda7.0/bin and LD_LIBRARY_PATH to include /usr/local/cuda7.0/lib64. PATH is set in /etc/environment (add LD_LIBRARY_PATH in the same file). After that, Theano found the GPU. Basic error on my part...
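To check whether the current shell actually picked up those two entries, a sketch like this may be useful. The helper name is hypothetical, and the directories are the ones quoted above; substitute whatever your CUDA installer printed.

```python
import os


def missing_entries(value, required):
    """Return the required directories absent from a colon-separated list."""
    present = [part.rstrip("/") for part in value.split(":") if part]
    return [d for d in required if d.rstrip("/") not in present]


if __name__ == "__main__":
    # Directories as quoted in the post; adjust to your install location.
    checks = {
        "PATH": ["/usr/local/cuda7.0/bin"],
        "LD_LIBRARY_PATH": ["/usr/local/cuda7.0/lib64"],
    }
    for variable, required in sorted(checks.items()):
        missing = missing_entries(os.environ.get(variable, ""), required)
        if missing:
            print("%s is missing: %s" % (variable, ", ".join(missing)))
        else:
            print("%s looks fine" % variable)
```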

I got

-> CUDA driver version is insufficient for CUDA runtime version

and in my case the cause was the selected GPU mode (Performance/Power Saving Mode). If you select the integrated Intel GPU with the nvidia-settings utility (under "PRIME Profiles") and then run deviceQuery, you get this error.

The error message is misleading: switching back to NVIDIA (Performance Mode) with nvidia-settings makes the problem disappear.

This is not a version problem.

Regards

PS: The selection is available when the PRIME-related packages are installed. Further details: https://askubuntu.com/questions/858030/nvidia-prime-in-nvidia-x-server-settings-in-16-04-1
