Keras 看到我的 GPU 但在訓練神經網絡時不使用它

Question

Keras/TensorFlow 不使用我的 GPU。

為了嘗試使我的 GPU 與 tensorflow 一起工作，我通過 pip 安裝了 tensorflow-gpu（我在 Windows88 上使用 Z855748Z66CF）

我有nvidia 1080ti

print(tf.test.is_gpu_available())

True

print(tf.config.experimental.list_physical_devices())

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), 
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

我綁

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

但它沒有幫助

sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))
print(sess)

Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1

<tensorflow.python.client.session.Session object at 0x000001A2A3BBACF8>

僅來自 tf 的警告：

W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows

整個日志：

2019-10-18 20:06:26.094049: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2019-10-18 20:06:35.078225: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-10-18 20:06:35.090832: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-10-18 20:06:35.180744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:06:35.185505: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:06:35.189328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:06:35.898592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:06:35.901683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:06:35.904235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:06:35.906687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-18 20:06:38.694481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:06:38.700482: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:06:38.704020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
[I 20:06:47.324 NotebookApp] Saving file at /Untitled.ipynb
2019-10-18 20:07:22.227110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:07:22.246012: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:07:22.261643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:07:22.272150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:07:22.275457: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:07:22.277980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:07:22.316260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2019-10-18 20:07:32.986802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
2019-10-18 20:07:32.990509: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-10-18 20:07:32.993763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-10-18 20:07:32.995570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-18 20:07:32.997920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-10-18 20:07:32.999435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-10-18 20:07:33.001380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8784 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-10-18 20:07:36.048204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-10-18 20:07:37.971703: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-10-18 20:07:38.576861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll

還嘗試使用 pip 重新安裝 tensorflow-gpu

為什么我認為 GPU 不起作用？ - 因為我的 python kernel 使用 CPU 99%，RAM 99%，有時 GPU ~7%，但大多數時候它是 0
我使用自定義數據生成器，但現在它只選擇批次並調整它們的大小（skimage.io.resize）1 epoch ~ 44s 還有每~10個樣本在隨機點凍結的奇怪行為，並且在最后一個樣本上幾乎沒有凍結（37/38）（ ~10-15 秒）

編輯：

我在這里發布我的自定義數據生成

train_gen = DataGenerator(x = x_train,
                              y = y_train,
                              batch_size = 128,
                              target_shape = (100, 100, 3), 
                              sample_std = False,
                              feature_std = False,
                              proj_parameters = None,
                              blur_parameters = None,
                              nois_parameters = None,
                              flip_parameters = None,
                              gamm_parameters = None)

驗證是一樣的

更新：

所以它是一個引起問題的生成器，但我該如何解決呢？
我只使用了 skimage 和 numpy 操作

Answer 1

日志顯示 GPU 確實被使用了。 您幾乎肯定會遇到 IO 瓶頸：您的 GPU 正在處理 CPU 拋出的任何內容，其速度比 CPU 加載和預處理它的速度要快。 這在深度學習中很常見，並且有一些方法可以解決它。

如果不了解您的數據管道（批處理的字節大小、預處理步驟等）以及數據的存儲方式，我們無法提供很多幫助。 加快速度的一種典型方法是將數據存儲為二進制格式，例如TFRecords ，以便 CPU 可以更快地加載它。 請參閱官方文檔。

編輯：我快速瀏覽了您的輸入管道。 這個問題很可能確實是由 IO 引起的：

您還應該在 GPU 上運行預處理步驟，您使用的許多增強技術都在tf.image中實現。 如果可以的話，你應該考慮使用 Tensorflow 2.0，因為它包括 Keras 並且里面也有很多助手。
查看tf.data.Dataset API，它有很多助手可以將所有數據加載到不同的線程中，這可以通過您擁有的內核數量大致加快進程。
您應該將圖像存儲為TFRecords 。 如果您的輸入圖像很小，這可能會將加載速度提高一個數量級。
您也可以嘗試更大的批量，我認為您的圖像可能真的很小。

Keras 看到我的 GPU 但在訓練神經網絡時不使用它

問題描述

編輯：

更新：

1 個解決方案

解決方案1
4 已采納 2019-10-18 17:48:58

Keras 看到我的 GPU 但在訓練神經網絡時不使用它

問題描述

編輯：

更新：

1 個解決方案

解決方案1 4 已采納 2019-10-18 17:48:58

解決方案1
4 已采納 2019-10-18 17:48:58