How to install mxnet on Google Colab?

I am trying to install mxnet with GPU support on Colab.

I guess the current Colab has CUDA 11.1 installed by default, as shown by

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
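The mxnet GPU wheels mentioned in this post are named after the CUDA release (mxnet-cu112, mxnet-cu102, mxnet-cu110). As a minimal sketch, the release can be pulled out of the nvcc output and turned into a candidate package name. The helper names `parse_nvcc_release` and `mxnet_package_for` are hypothetical, and a generated name is only a guess: a wheel for that release may not exist.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release (e.g. '11.1') found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def mxnet_package_for(release):
    """Build a candidate pip name such as 'mxnet-cu112' (it may not exist on PyPI)."""
    major, minor = release.split(".")
    return f"mxnet-cu{major}{minor}"


# One line copied from the nvcc output above.
sample = "Cuda compilation tools, release 11.1, V11.1.105"
release = parse_nvcc_release(sample)
print(release, mxnet_package_for(release))
```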

I tried three different approaches to get this working, but none of them worked.

First attempt - CUDA 11.2, local deb

First, I tried this set of commands from the NVIDIA documentation:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda-repo-ubuntu1804-11-2-local_11.2.0-460.27.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-2-local_11.2.0-460.27.04-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-2-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda

The installation went smoothly, but I ended up with the latest CUDA version, 11.4, instead of 11.2.

Second attempt - CUDA 11.2, runfile

Next, I tried the runfile:

!wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
!sh ./cuda_11.2.2_460.32.03_linux.run --toolkit --silent --override

The installation went smoothly, and I believe CUDA 11.2 was installed successfully, as this command shows:

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

Then I ran this command

!pip install mxnet-cu112

and got

Collecting mxnet-cu112
  Downloading mxnet_cu112-1.8.0.post0-py2.py3-none-manylinux2014_x86_64.whl (495.7 MB)
     |████████████████████████████████| 495.7 MB 15 kB/s 
Collecting graphviz<0.9.0,>=0.8.1
  Downloading graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Requirement already satisfied: numpy<2.0.0,>1.16.0 in /usr/local/lib/python3.7/dist-packages (from mxnet-cu112) (1.19.5)
Requirement already satisfied: requests<3,>=2.20.0 in /usr/local/lib/python3.7/dist-packages (from mxnet-cu112) (2.23.0)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.20.0->mxnet-cu112) (2021.5.30)
Installing collected packages: graphviz, mxnet-cu112
  Attempting uninstall: graphviz
    Found existing installation: graphviz 0.10.1
    Uninstalling graphviz-0.10.1:
      Successfully uninstalled graphviz-0.10.1
Successfully installed graphviz-0.8.4 mxnet-cu112-1.8.0.post0

Finally, I tested the installation with this command

import mxnet as mx

and got a libnvrtc error

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-7-265f02e9c062> in <module>()
----> 1 import mxnet as mx

4 frames
/usr/lib/python3.7/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    362 
    363         if handle is None:
--> 364             self._handle = _dlopen(self._name, mode)
    365         else:
    366             self._handle = handle

OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
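The traceback shows the failure coming from `ctypes` calling `dlopen`, so the same check can be reproduced without importing mxnet at all. The sketch below assumes nothing beyond the standard library. `try_load` is a hypothetical helper name, and the two library names are the ones reported in the errors in this post.

```python
import ctypes


def try_load(soname):
    """Try to dlopen a shared library by name; return (True, None) or (False, error text)."""
    try:
        ctypes.CDLL(soname)
        return True, None
    except OSError as exc:
        return False, str(exc)


# Library names taken from the error messages in this post.
for name in ("libnvrtc.so.11.2", "libcudart.so.10.2"):
    ok, err = try_load(name)
    print(name, "OK" if ok else f"FAILED: {err}")
```

If a name fails here as well, the problem lies in how the loader searches for the library rather than in mxnet itself.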

So I checked whether the library exists

!find /usr/ -name "libnvrtc*"

and got

/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-08c4863f.so.10.2
/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-builtins.so
/usr/local/lib/python2.7/dist-packages/torch/lib/libnvrtc-5e8a26c9.so.10.1
/usr/local/lib/python2.7/dist-packages/torch/lib/libnvrtc-builtins.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/nvrtc-prev/libnvrtc-builtins.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2.152
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.2
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.1.105
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.11.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1.105
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.0
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc-builtins.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.1
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1
/usr/local/cuda-10.1/targets/x86_64-linux/lib/stubs/libnvrtc.so
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1.243
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so.10.1.243
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc-builtins.so
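The listing above is long, but the useful information is which toolkit versions are present under /usr/local. A small sketch, assuming the `/usr/local/cuda-X.Y/` layout seen in the listing, can summarise it. The function name `cuda_versions_from_paths` is hypothetical, and the sample paths are a subset copied from the output above.

```python
import re


def cuda_versions_from_paths(paths):
    """Collect the distinct X.Y versions from paths like /usr/local/cuda-X.Y/..."""
    versions = set()
    for path in paths:
        match = re.search(r"/usr/local/cuda-(\d+)\.(\d+)/", path)
        if match:
            versions.add((int(match.group(1)), int(match.group(2))))
    # Sort numerically so that 10.x comes before 11.x.
    return [f"{major}.{minor}" for major, minor in sorted(versions)]


sample_paths = [
    "/usr/local/lib/python3.7/dist-packages/torch/lib/libnvrtc-08c4863f.so.10.2",
    "/usr/local/cuda-11.2/targets/x86_64-linux/lib/libnvrtc.so.11.2",
    "/usr/local/cuda-11.0/targets/x86_64-linux/lib/libnvrtc.so.11.0",
    "/usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1",
    "/usr/local/cuda-10.0/targets/x86_64-linux/lib/libnvrtc.so.10.0",
    "/usr/local/cuda-10.1/targets/x86_64-linux/lib/libnvrtc.so.10.1",
]
print(cuda_versions_from_paths(sample_paths))
# ['10.0', '10.1', '11.0', '11.1', '11.2']
```

The copy bundled with torch is ignored because it does not live in a `/usr/local/cuda-X.Y/` directory.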

I also ran this command:

%ll /usr/local/cuda/lib64/libnvrtc*

lrwxrwxrwx 1 root       25 Sep 22 00:58 /usr/local/cuda/lib64/libnvrtc-builtins.so -> libnvrtc-builtins.so.11.2*
lrwxrwxrwx 1 root       29 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc-builtins.so.11.2 -> libnvrtc-builtins.so.11.2.152*
-rwxr-xr-x 1 root  6122648 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc-builtins.so.11.2.152*
lrwxrwxrwx 1 root       16 Sep 22 00:58 /usr/local/cuda/lib64/libnvrtc.so -> libnvrtc.so.11.2*
lrwxrwxrwx 1 root       20 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc.so.11.2 -> libnvrtc.so.11.2.152*
-rwxr-xr-x 1 root 43954832 Sep 22 00:57 /usr/local/cuda/lib64/libnvrtc.so.11.2.152*

Does this mean I already have the library that mxnet-cu112 needs?

I tried pointing mxnet at the directory where "libnvrtc.so.11.2" is located,

%env LD_LIBRARY_PATH=/usr/local/cuda/lib64/

but that did not work either.
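One possible explanation, offered as an assumption rather than something confirmed in this post: the dynamic loader reads LD_LIBRARY_PATH when a process starts, so changing it inside an already-running notebook kernel may not affect libraries that the kernel loads afterwards. A workaround that might be worth trying is to load the library by absolute path before importing mxnet, so that a later request for the same library name can be satisfied by the copy already in memory. The sketch below uses only `ctypes`. `preload_libs` is a hypothetical helper, and the directory and file name come from the listing above.

```python
import ctypes
import os


def preload_libs(lib_dir, filenames):
    """Load each library by absolute path with RTLD_GLOBAL.

    Returns a dict mapping each file name to None on success or to the error text.
    """
    results = {}
    for filename in filenames:
        full_path = os.path.join(lib_dir, filename)
        try:
            ctypes.CDLL(full_path, mode=ctypes.RTLD_GLOBAL)
            results[filename] = None
        except OSError as exc:
            results[filename] = str(exc)
    return results


# Directory and file name as shown by the listing above.
status = preload_libs("/usr/local/cuda/lib64", ["libnvrtc.so.11.2"])
print(status)
```

If the reported value is None, the library is loaded in the current process and `import mxnet as mx` can be attempted again in the same kernel. Other CUDA libraries may still be reported missing and would need the same treatment.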

I also tried this

!apt-get install -y libnvrtc=11.2

and got this

Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libnvrtc

How can I fix the "libnvrtc" error?

Third attempt - CUDA 10.2

I factory-reset the runtime and tried these commands:

!wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
!sh ./cuda_10.2.89_440.33.01_linux.run --toolkit --silent --override
!pip install mxnet-cu102

Everything went fine until this command

import mxnet as mx

OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory

This command

%ll /usr/local/cuda/lib64/libcudart*

gives this

lrwxrwxrwx 1 root     17 Sep 22 01:36 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.10.2*
lrwxrwxrwx 1 root     20 Sep 22 01:35 /usr/local/cuda/lib64/libcudart.so.10.2 -> libcudart.so.10.2.89*
-rwxr-xr-x 1 root 509248 Sep 22 01:35 /usr/local/cuda/lib64/libcudart.so.10.2.89*
-rw-r--r-- 1 root 902366 Sep 22 01:36 /usr/local/cuda/lib64/libcudart_static.a

I also tried the suggestions in this thread, but none of them worked for me.

How can I fix the error?

Another possible solution might be to install a different version of mxnet, although there does not seem to be an mxnet binary for CUDA 11.1.

The following approach works for cuda-10.0 and cuda-11.0.

!sudo ln -sfT /usr/local/cuda-10.0/ /usr/local/cuda
!pip install mxnet-cu100mkl

import mxnet
mxnet.__version__

For cuda-11.0, just replace the first two lines with:

!sudo ln -sfT /usr/local/cuda-11.0/ /usr/local/cuda
!pip install mxnet-cu110
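Printing the version only confirms that the import succeeds. To check that the GPU is actually usable, a small smoke test along these lines may help. It is a sketch that assumes the mxnet 1.x API (`mx.context.num_gpus`, `mx.nd.ones`, `mx.gpu`), and `check_mxnet_gpu` is a hypothetical helper name.

```python
def check_mxnet_gpu():
    """Return a short status string describing whether mxnet can import and use a GPU."""
    try:
        import mxnet as mx
    except Exception as exc:  # ImportError, or OSError for a missing CUDA library
        return f"import failed: {exc}"

    try:
        gpu_count = mx.context.num_gpus()
        if gpu_count == 0:
            return f"mxnet {mx.__version__} imported, but no GPU is visible"
        # Allocate a small array on the first GPU and read a value back.
        x = mx.nd.ones((2, 2), ctx=mx.gpu(0))
        total = x.sum().asscalar()
        return f"mxnet {mx.__version__} imported, {gpu_count} GPU(s), test sum = {total}"
    except Exception as exc:
        return f"mxnet imported, but the GPU test failed: {exc}"


print(check_mxnet_gpu())
```

On a CPU-only runtime, or when a CUDA library cannot be found, the function reports the failure as text instead of raising.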
