I am trying to use tensorflow-gpu in a Jupyter notebook inside a Docker container running on my Ubuntu 18.04 (Bionic Beaver) server.
I have done the following steps:
1) Installed Nvidia drivers 390.67: sudo apt-get install nvidia-driver-390
2) Installed CUDA Toolkit 9.0: cuda_9.0.176_384.81_linux.run
3) Installed cuDNN 7.0.5: cudnn-9.0-linux-x64-v7.tgz
4) Installed Docker: sudo apt install docker-ce
5) Installed nvidia-docker2: sudo apt install nvidia-docker2
I then start the container with: nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:1.5.1-gpu-py3
The reason I am using TensorFlow 1.5.1 is that I was getting this same "kernel dead" error on 1.8.0-gpu-py, and I read that you need to use TensorFlow 1.5 for older CPUs. I don't think that is really the issue, though, since I'm simply trying to import it and I'm using tensorflow-gpu.
When I run any cell that imports tensorflow for the first time, I get a "kernel died" error.
My server hardware is as follows:
CPU: AMD Phenom(tm) II X4 965 Processor
GPU: GeForce GTX 760
Motherboard: ASRock 960GM/U3S3 FX
Memory: G Skill F3-1600C9D-8GAB (8 GB Memory)
How can I determine why the kernel is dying when I simply import tensorflow using import tensorflow as tf?
Here is the output of nvidia-smi run inside Docker:
$ docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Fri Jun 22 17:53:20 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 760     Off  | 00000000:01:00.0 N/A |                  N/A |
|  0%   34C    P0    N/A /  N/A |      0MiB /  1999MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
This matches exactly what I get when I run nvidia-smi outside Docker.
Here is the nvcc --version result:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
If I instead run nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:1.5.1-gpu-py3 bash to bring up a bash prompt, start a Python session with python, and then do import tensorflow as tf, I get Illegal instruction (core dumped), so it isn't working in a non-Jupyter environment either. The error still occurs even if I do import numpy first and then import tensorflow as tf.
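Since a crash like this kills the interpreter instead of raising an exception, one way to see what happened from Python is to do the import in a child process and look at its return code. This is only a rough sketch, assuming Python 3.5+ on a POSIX system; describe_exit and try_import are helper names made up for illustration.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number (e.g. -4 is SIGILL).
        return signal.Signals(-returncode).name
    return "exit code %d" % returncode


def try_import(module_name):
    """Import module_name in a fresh interpreter so a crash cannot kill this one."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return describe_exit(result.returncode)


if __name__ == "__main__":
    print("import tensorflow ->", try_import("tensorflow"))
```

If this reports SIGILL, the process executed a CPU instruction the processor does not implement, which is what the shell prints as Illegal instruction. In that case the GPU, CUDA and driver setup are probably not the cause.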
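It turns out I needed to downgrade to TensorFlow 1.5.0. 1.5.1 is apparently where AVX was added, and AVX instructions are apparently used on module load to set up the library. The Phenom II X4 965 does not support AVX, which would explain why the import dies with Illegal instruction.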
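To check whether a given machine has AVX before picking an image, the CPU feature flags can be read from /proc/cpuinfo. Here is a small sketch, assuming an x86 Linux host where that file has a "flags" line; cpu_flags and has_avx are illustrative names, not part of any library.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # All cores normally report the same flags, so the first entry is enough.
            return set(value.split())
    return set()


def has_avx(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the 'avx' flag is listed for this CPU (Linux only)."""
    with open(cpuinfo_path) as handle:
        return "avx" in cpu_flags(handle.read())


# Example usage on the Docker host or inside the container:
#     print("AVX supported:", has_avx())
```

The flags are compared as whole words, so a flag such as avx2 is not mistaken for avx. If has_avx() returns False, a TensorFlow binary built with AVX is likely to fail on import in the way described above.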