
Issues installing mxnet GPU R package for Amazon deep learning AMI

I am having trouble installing the mxnet GPU package for R on the Amazon Deep Learning Linux AMI. The environment variables are such a mess that it's a nightmare for anyone who isn't an expert sysadmin to figure out.

Step 1: install the ridiculous amount of missing/broken programs and R packages

sudo yum install R
sudo yum install libxml2-devel   
sudo yum install cairo-devel
sudo yum install giflib-devel
sudo yum install libXt-devel
sudo R
install.packages("devtools")
library(devtools)
install_github("igraph/rigraph")
install.packages("DiagrammeR")
install.packages("roxygen2")
install.packages("rgexf")
install.packages("influenceR")
install.packages("Cairo")
install.packages("imager")

Step 2: edit the config.mk file

cd ~/src/mxnet
cp make/config.mk .
echo "USE_BLAS=openblas" >>config.mk
echo "ADD_CFLAGS += -I/usr/include/openblas" >>config.mk
echo "ADD_LDFLAGS += -lopencv_core -lopencv_imgproc -lopencv_imgcodecs" >>config.mk
echo "USE_CUDA=1" >>config.mk
echo "USE_CUDA_PATH=/usr/local/cuda" >>config.mk
echo "USE_CUDNN=1" >>config.mk

*Note: even though USE_CUDA_PATH is set, the build STILL cannot find libcudart.so, so the library path has to be passed on the make command line (shown later).
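
Because every `echo ... >>config.mk` appends a new line, re-running Step 2 leaves duplicate assignments in config.mk. The following is a minimal sketch of an idempotent alternative, not something from the mxnet build system: `set_mk_var` is a hypothetical helper name, it only handles plain `KEY = value` lines (not the `+=` flag lines), and it assumes the value contains no `|` or `&` characters. The demo runs against a scratch file rather than the real config.mk.

```shell
#!/bin/sh
# set_mk_var FILE KEY VALUE (hypothetical helper, not part of mxnet):
# replace an existing "KEY = ..." line in FILE, or append one if KEY is absent.
set_mk_var() {
    file=$1; key=$2; value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Write to a temp file and move it into place (portable stand-in for sed -i).
        sed -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Demo on a scratch file standing in for config.mk.
cfg=$(mktemp)
printf 'USE_CUDA = 0\nUSE_CUDNN = 0\n' > "$cfg"
set_mk_var "$cfg" USE_CUDA 1
set_mk_var "$cfg" USE_CUDA_PATH /usr/local/cuda
set_mk_var "$cfg" USE_CUDNN 1
set_mk_var "$cfg" USE_CUDA 1      # repeated call leaves the file unchanged
cat "$cfg"
rm -f "$cfg"
```

Pointing the helper at the real config.mk instead of the scratch file would give one assignment per key no matter how many times the step is repeated.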

Step 3: create a new config file so the make command can find libcudart.so

Create /etc/ld.so.conf.d/cuda.conf containing the line:

/usr/local/cuda-8.0/lib64

then run:

sudo ldconfig
  • Note: this was posted by NVIDIA but does absolutely nothing to help make rpkg.

Step 4: set up R directories

Rscript -e "install.packages('devtools', repo = 'https://cran.rstudio.com')"
cd R-package
Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cran.rstudio.com')); install_deps(dependencies = TRUE)"
cd ..

Step 5: make

cd ~/src/mxnet
sudo make -j8

Result:

make CXX=g++ DEPS_PATH=/home/ec2-user/src/mxnet/deps -C /home/ec2-user/src/mxnet/ps-lite ps
cd /home/ec2-user/src/mxnet/dmlc-core; make libdmlc.a USE_SSE=1 config=/home/ec2-user/src/mxnet/config.mk; cd /home/ec2-user/src/mxnet
make[1]: Entering directory `/home/ec2-user/src/mxnet/dmlc-core'
make[1]: `libdmlc.a' is up to date.
make[1]: Leaving directory `/home/ec2-user/src/mxnet/dmlc-core'
make[1]: Entering directory `/home/ec2-user/src/mxnet/ps-lite'
make[1]: Nothing to be done for `ps'.
make[1]: Leaving directory `/home/ec2-user/src/mxnet/ps-lite'
ar crv lib/libmxnet.a

*Note: even after changing the config.mk file, the make command always reports that there is nothing to update.

Step 6: attempt to make rpkg

cd ~/src/mxnet
sudo make rpkg

Error:

Error: package or namespace load failed for 'mxnet':
 .onLoad failed in loadNamespace() for 'mxnet', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/usr/lib64/R/library/mxnet/libs/libmxnet.so':
  libcudart.so.8.0: cannot open shared object file: No such file or directory
Error: loading failed
Execution halted
ERROR: loading failed

So it's looking in a location that doesn't exist, /usr/lib64/R/library/mxnet/libs/, when the file actually lives at /home/ec2-user/src/mxnet/R-package/inst/libs/libmxnet.so or /home/ec2-user/src/mxnet/lib/libmxnet.so.
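
One way to see which shared libraries a particular libmxnet.so cannot resolve is to run `ldd` on it and keep only the `not found` lines. This is a minimal sketch rather than a definitive diagnosis: `missing_libs` is a hypothetical helper name, and the path is the build-tree copy mentioned above, which may differ on another machine.

```shell
#!/bin/sh
# missing_libs (hypothetical helper): read `ldd` output on stdin and print
# the names of shared libraries that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Path taken from the question; adjust to wherever libmxnet.so actually lives.
so=/home/ec2-user/src/mxnet/lib/libmxnet.so
if [ -f "$so" ]; then
    ldd "$so" | missing_libs
else
    echo "no such file: $so" >&2
fi
```

Each name it prints is a dependency whose directory is neither in the linker cache nor in LD_LIBRARY_PATH for the process doing the loading.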

What I've tried so far:

sudo LD_LIBRARY_PATH=/usr/local/cuda/lib64 make rpkg

This fixes the missing libcudart.so.8.0 issue, but it is simply replaced with: libmklml_intel.so: cannot open shared object file: No such file or directory, as well as the original 'cannot find libmxnet.so'.

Also tried:

  1. actually creating the directory (/usr/lib64/R/library/mxnet/libs/) and then copying libmxnet.so there. Result: same error

  2. adding /home/ec2-user/src/mxnet/R-package/inst/libs/ to the make command: sudo LD_LIBRARY_PATH=/home/ec2-user/src/mxnet/R-package/inst/libs make rpkg. Result: same error

  3. a ridiculous number of environment settings, all of which failed:

    export MXNET_HOME=/usr/lib64/R/library/mxnet/libs/
    export MXNET_HOME=/usr/lib64/R/library/mxnet/libs/libmxnet.so
    sudo ldconfig /usr/local/cuda/lib64
    sudo ln -s /usr/lib64/R/library/mxnet/libs /usr/lib
    sudo ln -s /usr/lib64/R/library/mxnet/libs/libmxnet.so /usr/lib
    sudo ln -s /usr/local/lib/libmklml_intel.so /usr/lib
    sudo ln -s /usr/local/lib/libiomp5.so /usr/lib
    sudo ln -s /usr/local /usr/lib
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/libcudart.so.8.0
    export LD_LIBRARY_PATH=/usr/lib64/R/library/mxnet/libs/libmxnet.so /usr/lib
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/targets/x86_64-linux/lib/:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/libcudart.so.8.0

In all, ONE of these worked, because I briefly got the mxnet R package working before it fell apart again. I've dropped 50+ hours into this installation, which, frankly, is ridiculous. It's tougher to install the software than it is to program an actual net....

I don't have 5+ years of Linux sysadmin knowledge, so if you'd like to help, please be a bit more specific than 'fix environment variables.' I can tell that's obviously what's wrong, yet I have no idea what 'fix environment variables' entails.

To top it off, even after a successful install of the R package, it STILL won't work until RStudio Server's config file is set to: rsession-ld-library-path=/opt/local/lib:/usr/local/cuda/lib64

Did you try the following when running the sudo commands?

sudo -E make -j8

The -E flag preserves the environment variables when running as superuser. You shouldn't have to add a new config file for make to find the libraries; preserving the environment variables with the command above should be enough.
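
Whether a given variable actually reaches the command run under sudo depends on the sudoers policy, and on some systems linker variables such as LD_LIBRARY_PATH may be dropped even with -E, so it can help to check directly. A minimal sketch follows. `show_ld_path` is a hypothetical helper that prints what a child process sees, and the sudo invocations are left as comments because they need root.

```shell
#!/bin/sh
# show_ld_path [PREFIX...] (hypothetical helper): run a child shell, optionally
# behind PREFIX (for example "sudo -E"), and print the LD_LIBRARY_PATH it sees.
show_ld_path() {
    "$@" sh -c 'printf "%s\n" "${LD_LIBRARY_PATH:-<unset>}"'
}

export LD_LIBRARY_PATH="/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
show_ld_path                            # plain child process: inherits the exported value
show_ld_path env -u LD_LIBRARY_PATH     # variable removed: prints <unset>

# The same check behind sudo (needs root, so not run here):
#   show_ld_path sudo
#   show_ld_path sudo -E
#   show_ld_path sudo LD_LIBRARY_PATH="$LD_LIBRARY_PATH"
```

If the `sudo -E` form prints `<unset>`, passing the variable explicitly on the sudo command line, as in the last commented form, is the variant that already changed the error in the question.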


 