
Am I training with GPU?

I'm training a neural network model with Keras, using TensorFlow as the backend. The log file starts with the following messages:

nohup: ignoring input
2019-02-12 17:44:29.414526: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-02-12 17:44:30.191565: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:65:00.0
totalMemory: 7.93GiB freeMemory: 7.81GiB
2019-02-12 17:44:30.191601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2019-02-12 17:44:30.409790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-02-12 17:44:30.409828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0 
2019-02-12 17:44:30.409834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N 
2019-02-12 17:44:30.410015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7535 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:65:00.0, compute capability: 6.1)

Does this mean that the training is performed on the GPU?

I would say yes, but when I run nvtop, I see that all of the GPU memory is in use while 0% of the GPU compute capacity is used (see the nvtop screenshot below):

[Screenshot: nvtop output]

Also, when I type htop in the command line, I see that one CPU is fully used (see the htop screenshot below).

[Screenshot: htop output]

How come the GPU memory is used, yet the computation runs on the CPU instead of the GPU?

I think you have compiled TensorFlow (or installed a pre-compiled package) with CUDA support, but without support for all the instructions available on your CPU (your CPU supports the AVX2, AVX512F and FMA instructions that TensorFlow can use).
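
As a quick sanity check (a minimal sketch, assuming a TensorFlow 1.x installation, which matches the source paths in your log), you can confirm from Python that the binary was built with CUDA and that the GPU is visible:

import tensorflow as tf
from tensorflow.python.client import device_lib

# True if this TensorFlow binary was compiled with CUDA (GPU) support
print(tf.test.is_built_with_cuda())

# True if TensorFlow can actually use a GPU right now
print(tf.test.is_gpu_available())

# Lists every device TensorFlow sees; a working setup includes /device:GPU:0
print(device_lib.list_local_devices())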

This means that TensorFlow will work fine (with full GPU support), but you can't use your processor at full capacity.
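
If you want to verify that your ops really land on the GPU rather than the CPU, you can turn on device-placement logging. A minimal sketch, assuming the TensorFlow 1.x session API your log comes from:

import tensorflow as tf

# Ask TensorFlow to print the device each operation is assigned to
config = tf.ConfigProto(log_device_placement=True)

with tf.Session(config=config) as sess:
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name='a')
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]], name='b')
    c = tf.matmul(a, b, name='matmul_test')
    print(sess.run(c))

Each op should be reported on /device:GPU:0 if the GPU is being used. With Keras on the TensorFlow backend, the same config can be applied by calling keras.backend.set_session(tf.Session(config=config)) before building the model.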

Try comparing the time (GPU vs CPU) with this example: https://stackoverflow.com/a/54661896/10418812
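
In the same spirit, here is a minimal GPU-vs-CPU timing sketch (not the linked answer verbatim; it assumes TensorFlow 1.x and a visible /device:GPU:0):

import time
import tensorflow as tf

def time_matmul(device_name):
    # Pin a large matrix multiplication to the requested device
    with tf.device(device_name):
        x = tf.random_normal([4000, 4000])
        y = tf.matmul(x, x)
    with tf.Session() as sess:
        start = time.time()
        sess.run(y)  # note: the first run also pays a one-time initialization cost
        return time.time() - start

print('GPU time:', time_matmul('/device:GPU:0'))
print('CPU time:', time_matmul('/device:CPU:0'))

On a GTX 1080 the GPU run should be dramatically faster than the CPU run, which is the simplest confirmation that the heavy math is really happening on the GPU.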
