Keras, TensorFlow, cuDNN fails to initialize

I have a very powerful Windows PC (running Windows 10) with 112 GB of memory, 16 cores and 3 x GeForce RTX 2070 (no SLI support etc.). It is running cuDNN 7.5 + TensorFlow 1.13 + Python 3.7.

My issue is that I get the error below whenever I try to run a Keras model for training or to make a prediction on a matrix. At first I thought it happened only when I ran more than one program simultaneously, but that is not the case: I now also get the error when running only a single instance of Keras (often, but not always).

2019-06-15 19:33:17.878911: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 6317 MB memory) -> physical GPU (device: 2, name: GeForce RTX 2070, pci bus id: 0000:44:00.0, compute capability: 7.5)
2019-06-15 19:33:23.423911: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library cublas64_100.dll locally
2019-06-15 19:33:23.744678: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:23.748069: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:23.751235: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:25.267137: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2019-06-15 19:33:25.270582: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Exception: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv2d_1/convolution}}]]
[[{{node dense_3/Sigmoid}}]]
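
The exception text suggests looking for earlier warning or error lines in the log. As a rough sketch of how that could be automated, the snippet below scans log text for error-level lines that mention ALLOC_FAILED. The names `LOG_LINE`, `find_alloc_failures` and `sample_log` are hypothetical, and the line format is only inferred from the log shown in the question, so other TensorFlow versions may need a different pattern.

```python
import re

# Pattern inferred from the log lines in the question (an assumption):
# "<date> <time>: <level> <source file:line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<source>\S+)\] (?P<message>.*)$"
)


def find_alloc_failures(log_text):
    """Return (source, message) pairs for error lines that mention ALLOC_FAILED."""
    failures = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        if match.group("level") == "E" and "ALLOC_FAILED" in match.group("message"):
            failures.append((match.group("source"), match.group("message")))
    return failures


# Three lines copied from the log in the question, used as sample input.
sample_log = """\
2019-06-15 19:33:23.423911: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library cublas64_100.dll locally
2019-06-15 19:33:23.744678: E tensorflow/stream_executor/cuda/cuda_blas.cc:510] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2019-06-15 19:33:25.267137: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
"""

for source, message in find_alloc_failures(sample_log):
    print(source, "->", message)
```

Both the cuBLAS and cuDNN lines are reported here, which matches the reading that handle creation failed because GPU memory could not be allocated.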

On TensorFlow 2.0 and above, you can solve this issue as follows:

import os  # set the variable before TensorFlow initializes the GPU
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'
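
Since the environment variable has to be in place before TensorFlow sets up the GPU, ordering matters. The sketch below is one cautious way to handle that, assuming the safest point is before `tensorflow` is imported at all. The helper name `allow_gpu_memory_growth` is hypothetical and not part of any library.

```python
import os
import sys


def allow_gpu_memory_growth():
    """Set TF_FORCE_GPU_ALLOW_GROWTH; return False if tensorflow was already imported."""
    already_imported = "tensorflow" in sys.modules
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return not already_imported


if not allow_gpu_memory_growth():
    # Assumption: the setting may come too late if the GPU is already initialized.
    print("Warning: tensorflow was imported before the variable was set.")

# import tensorflow as tf  # import TensorFlow only after the variable is set
```

Placing the call at the very top of the entry script, ahead of any Keras or TensorFlow import, keeps the ordering obvious.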

or

import tensorflow as tf

# Enable memory growth on every visible GPU, not only the first one
physical_devices = tf.config.experimental.list_physical_devices('GPU')
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)
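
`set_memory_growth` has to run before TensorFlow initializes the GPUs, otherwise it raises a `RuntimeError`. A minimal sketch of a guarded version follows, assuming TensorFlow 2.x. The function name `enable_memory_growth_on_all_gpus` is hypothetical, and the `ImportError` fallback is there only so the sketch can run as a no-op on a machine without TensorFlow.

```python
try:
    import tensorflow as tf
except ImportError:  # lets the sketch run as a no-op where TensorFlow is absent
    tf = None


def enable_memory_growth_on_all_gpus():
    """Turn on memory growth for each visible GPU; return how many were configured."""
    if tf is None:
        return 0
    gpus = tf.config.experimental.list_physical_devices("GPU")
    configured = 0
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPUs were already initialized before this call.
            print(f"Could not set memory growth for {gpu.name}: {err}")
    return configured


print("GPUs configured:", enable_memory_growth_on_all_gpus())
```

Call it once at program start, before building any model or creating tensors, so that no GPU has been initialized yet.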

For TensorFlow 1.x (such as the 1.13 setup in the question), add the following to your code:

from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # dynamically grow the memory used on the GPU
config.log_device_placement = True  # to log device placement (on which device the operation ran)
sess = tf.Session(config=config)
set_session(sess)  # set this TensorFlow session as the default session for Keras
