Jupyter Notebook kernel dies when importing TensorFlow
I am trying to use tensorflow-gpu in a Jupyter notebook inside a Docker container running on my Ubuntu 18.04 Bionic Beaver server.
I have done the following steps:
1) Installed Nvidia driver 390.67:
sudo apt-get install nvidia-driver-390
2) Installed CUDA 9.0 from the runfile:
cuda_9.0.176_384.81_linux.run
3) Installed cuDNN 7.0.5 from the archive:
cudnn-9.0-linux-x64-v7.tgz
4) Installed Docker:
sudo apt install docker-ce
5) Installed nvidia-docker2:
sudo apt install nvidia-docker2
I then run the following:
nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:1.5.1-gpu-py3
I am using TensorFlow 1.5.1 because I was getting the same dead-kernel error on 1.8.0-gpu-py, and I read that older CPUs need TensorFlow 1.5. I don't think that is really the issue, though, since I'm simply trying to import it and I'm using tensorflow-gpu.
When I run any cell that imports tensorflow for the first time, the kernel dies.
My server hardware is as follows:
CPU: AMD Phenom(tm) II X4 965 Processor
GPU: GeForce GTX 760
Motherboard: ASRock 960GM/U3S3 FX
Memory: G Skill F3-1600C9D-8GAB (8 GB Memory)
How can I determine why the kernel is dying when I simply import tensorflow using import tensorflow as tf?
Here is the result of running nvidia-smi through nvidia-docker:
$ docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Fri Jun 22 17:53:20 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 760     Off  | 00000000:01:00.0 N/A |                  N/A |
|  0%   34C    P0    N/A /  N/A |      0MiB /  1999MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
This matches exactly what I get if I run nvidia-smi outside Docker.
Here is the nvcc --version result:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
If I run
nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:1.5.1-gpu-py3 bash
to bring up a bash prompt, start a Python session with python, and then run import tensorflow as tf, I get Illegal instruction (core dumped), so it isn't working in a non-Jupyter environment either.
This error still occurs even if I run import numpy first and then import tensorflow as tf.
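One way to see why the kernel dies, rather than just that it dies, is to run the import in a child interpreter and inspect how the child exited. The following is a minimal sketch, not something from the original post: run_snippet is a hypothetical helper name, and it assumes a POSIX system and Python 3.5 or later, where a negative return code means the child was killed by that signal number.

```python
import signal
import subprocess
import sys


def run_snippet(code):
    """Run `code` in a fresh Python interpreter and report how it exited.

    Returns (returncode, signal_name). signal_name is None unless the
    child process was terminated by a signal.
    """
    result = subprocess.run([sys.executable, "-c", code])
    signal_name = None
    if result.returncode < 0:
        # On POSIX, a return code of -N means "killed by signal N".
        try:
            signal_name = signal.Signals(-result.returncode).name
        except ValueError:
            signal_name = "signal {}".format(-result.returncode)
    return result.returncode, signal_name


# Hypothetical usage inside the container:
#   code, sig = run_snippet("import tensorflow")
#   print("exit code:", code, "signal:", sig)
```

If the reported signal is SIGILL, the process executed a CPU instruction the processor does not implement, which is the same condition the shell reports as Illegal instruction (core dumped).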
It turns out I needed to downgrade to TensorFlow 1.5.0. 1.5.1 is where AVX was added, and AVX instructions are apparently used on module load to set up the library. The Phenom II X4 965 does not support AVX, so the import itself fails with an illegal instruction.
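To check whether a machine is affected before pulling an image, the CPU feature flags can be read from /proc/cpuinfo. This is a rough sketch under stated assumptions: cpu_flags and supports_avx are hypothetical helper names, and it assumes an x86 Linux host, where /proc/cpuinfo has a line starting with "flags" (other architectures use different keys).

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough; every core lists the same set.
            return set(value.split())
    return set()


def supports_avx(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the CPU described by cpuinfo_path advertises AVX."""
    with open(cpuinfo_path) as f:
        return "avx" in cpu_flags(f.read())


# Hypothetical usage on the host or inside the container:
#   print("AVX supported:", supports_avx())
```

If supports_avx() returns False, a TensorFlow binary compiled with AVX can be expected to crash on import, and an older build such as 1.5.0 or a build from source without AVX is the safer choice.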