Jupyter Notebook Kernel dies when importing Tensorflow

I am trying to use tensorflow-gpu in a Jupyter notebook inside a Docker container running on my Ubuntu 18.04 (Bionic Beaver) server.

I have done the following steps:
1) Installed Nvidia driver 390.67: sudo apt-get install nvidia-driver-390
2) Installed CUDA 9.0: cuda_9.0.176_384.81_linux.run
3) Installed cuDNN 7.0.5: cudnn-9.0-linux-x64-v7.tgz
4) Installed Docker: sudo apt install docker-ce
5) Installed nvidia-docker2: sudo apt install nvidia-docker2

I then run the following: nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:1.5.1-gpu-py3

The reason I am using Tensorflow 1.5.1 is that I was getting the same dead-kernel error on 1.8.0-gpu-py, and I read that older CPUs need Tensorflow 1.5. I don't think that is really the issue, though, since I am simply trying to import it and I am using tensorflow-gpu.

When I run any cell that imports tensorflow for the first time, I get the dead-kernel error.

My server hardware is as follows:

CPU: AMD Phenom(tm) II X4 965 Processor
GPU: GeForce GTX 760
Motherboard: ASRock 960GM/U3S3 FX
Memory: G Skill F3-1600C9D-8GAB (8 GB Memory)

How can I determine why the kernel is dying when I simply import tensorflow using import tensorflow as tf?

Here is the output of nvidia-smi run inside Docker:

$ docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Fri Jun 22 17:53:20 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 760     Off  | 00000000:01:00.0 N/A |                  N/A |
|  0%   34C    P0    N/A /  N/A |      0MiB /  1999MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

This matches exactly what I get when I run nvidia-smi outside Docker.

Here is the nvcc --version result:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

If I instead run nvidia-docker run -it -p 8888:8888 tensorflow/tensorflow:1.5.1-gpu-py3 bash to bring up a bash prompt, start a Python session with python, and then run import tensorflow as tf, I get Illegal instruction (core dumped), so it isn't working outside Jupyter either. The error still occurs even if I import numpy first and then run import tensorflow as tf.
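One way to see why an import kills the interpreter is to run it in a child process and inspect how that child ended. The following is only a minimal sketch, assuming a POSIX system and Python 3.5 or later; the helper names describe_exit and try_import are made up for illustration. On POSIX, subprocess reports a negative return code when the child was killed by a signal, and the -X faulthandler option asks Python to dump a traceback on fatal signals such as SIGILL.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative return code means the child died from that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal {} ({})".format(-returncode, name)
    return "exited with status {}".format(returncode)


def try_import(module_name):
    """Import a module in a child interpreter so a crash cannot kill the caller."""
    # -X faulthandler makes the child print a Python traceback on fatal signals.
    cmd = [sys.executable, "-X", "faulthandler", "-c",
           "import {}".format(module_name)]
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    return result.returncode, result.stderr


if __name__ == "__main__":
    code, stderr = try_import("tensorflow")
    print(describe_exit(code))
    print(stderr)
```

Run inside the container, a report of SIGILL would point at an unsupported CPU instruction rather than at Jupyter or the GPU setup.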

It turns out I needed to downgrade to Tensorflow 1.5.0. 1.5.1 is where AVX was added, and AVX instructions are apparently used on module load to set up the library. The Phenom II X4 965 does not support AVX, which explains the Illegal instruction crash on import.
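Whether a CPU advertises AVX can be checked before picking a Tensorflow build. This is a rough sketch, assuming an x86 Linux host where the kernel lists CPU features on the flags line of /proc/cpuinfo; the function names parse_cpu_flags and has_avx are hypothetical helpers, not part of any library.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace-separated tokens such as 'sse2' or 'avx'.
            return set(value.split())
    return set()


def has_avx(cpuinfo_path="/proc/cpuinfo"):
    """Report whether the CPU described by cpuinfo_path lists the 'avx' flag."""
    with open(cpuinfo_path) as handle:
        return "avx" in parse_cpu_flags(handle.read())


if __name__ == "__main__":
    try:
        print("AVX supported:", has_avx())
    except OSError:
        # /proc/cpuinfo is Linux-specific; other systems need a different check.
        print("Could not read /proc/cpuinfo on this system")
```

Because the flags are compared as whole tokens, a CPU that lists only a longer name such as avx2 is not mistaken for one that lists avx.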
