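Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above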

Question

I am working on a face recognition project on Google Colab. When I try to execute the following code

H = model.fit(
    aug.flow(trainX, trainY, batch_size=BS),
    steps_per_epoch=len(trainX) // BS,
    validation_data=(testX, testY),
    validation_steps=len(testX) // BS,
    epochs=EPOCHS)

it gives me this error

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node model/Conv1/Conv2D
 (defined at /usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py:238)
]] [Op:__inference_train_function_7525]

Errors may have originated from an input operation.
Input Source operations connected to node model/Conv1/Conv2D:
In[0] IteratorGetNext (defined at /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:866)  
In[1] model/Conv1/Conv2D/ReadVariableOp:

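The error output goes on for many more lines. I did try restarting the runtime. Most solutions I found for this problem are for local machines. If anyone knows a solution, please help me.

TensorFlow version: 2.7.0, CUDA version: 11.2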

Answer 1

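You can turn on memory growth by calling tf.config.experimental.set_memory_growth. With this option TensorFlow allocates only as much GPU memory as the runtime needs: it starts out allocating very little memory, then extends the GPU memory region as the model trains and more memory is required. To turn on memory growth for a specific GPU, run the following code before allocating any tensors or executing any ops.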

import tensorflow as tf

def solve_cudnn_error():
    gpus = tf.config.experimental.list_physical_devices('GPU')
    if gpus:
        try:
            # Currently, memory growth needs to be the same across GPUs
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized
            print(e)

# Call this once, before building the model or calling model.fit
solve_cudnn_error()
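
As an alternative to setting memory growth in code, TensorFlow also reads the TF_FORCE_GPU_ALLOW_GROWTH environment variable. The following is a minimal sketch, assuming it runs in the first cell of a fresh Colab runtime, before TensorFlow has initialized the GPU. Whether it resolves this particular error is not confirmed by the original post.

```python
import os

# Assumption: this runs before TensorFlow initializes any GPU.
# Setting the variable before the import is the simplest way to ensure that.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf
except ImportError:
    # TensorFlow is not installed in this environment; nothing more to do.
    tf = None

if tf is not None:
    # List the GPUs TensorFlow can see (an empty list means CPU only).
    print(tf.config.list_physical_devices("GPU"))
```

If the variable is set after the GPU has already been initialized, it has no effect, so restart the runtime first.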
