How to train a CNN for classification?

Question

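I am currently trying to train a model for bird species recognition. The model will later be converted and hosted on an Arduino Nano 33 BLE, placed near a spot where birds come to eat.

To train my model, I used the Kaggle API to download a dataset of 250 species, split into training, validation, and test sets. The images are 224x224 RGB .jpg files. To simplify labeling, I used the Keras preprocessing tools, which let me derive the labels from the folder names. This works very well.

Here is the preprocessing: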


    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    
    # All images will be augmented
    train_datagen = ImageDataGenerator(
          rescale=1./255,
          rotation_range=40,
          width_shift_range=0.2,
          height_shift_range=0.2,
          shear_range=0.2,
          zoom_range=0.2,
          horizontal_flip=True,
          fill_mode='nearest')
    
    # Flow training images in batches of 128 using train_datagen generator
    train_generator = train_datagen.flow_from_directory(
            '/content/train',  # This is the source directory for training images
            target_size=(224, 224),  # All images will be resized to 224x224
            batch_size=128,
            class_mode='binary',
            color_mode='rgb',
            save_format='jpg')
    
    validation_datagen = ImageDataGenerator(rescale=1/255)
    
    validation_generator = validation_datagen.flow_from_directory(
            '/content/valid',
            target_size=(224, 224),
            class_mode='categorical',
            color_mode='rgb',
            save_format='jpg')

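I then created a Keras model with convolution and max-pooling layers to process my data, followed by 2 dense layers, the last of which uses softmax activation. Here is my model code: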


    import tensorflow as tf
    
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(224, 224, 3)),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2,2),
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2,2),
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2,2),
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2,2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dense(250, activation='softmax')
    ])

The error I am facing is:

    InvalidArgumentError                      Traceback (most recent call last)
    <ipython-input-58-6a14ef1f8bcb> in <module>()
          4       epochs=15,
          5       verbose=1,
    ----> 6       validation_data=validation_generator)

    6 frames
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
         58     ctx.ensure_initialized()
         59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
    ---> 60                                         inputs, attrs, num_outputs)
         61   except core._NotOkStatusException as e:
         62     if name is not None:

    InvalidArgumentError:  Can not squeeze dim[1], expected a dimension of 1, got 250
         [[node Squeeze (defined at <ipython-input-58-6a14ef1f8bcb>:6) ]] [Op:__inference_test_function_3788]

    Function call stack:
    test_function

My project repository: https://github.com/BaptisteZloch/Birds-species-spotting

I hope someone can help me solve this problem!

Regards, Baptiste Zloch

Answer 1

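I am the creator of the dataset you are using. You really do not need much image augmentation, because there are 35215 training images, 1250 test images (5 per species), and 1250 validation images (5 per species). At most I would use only horizontal_flip=True. All the rest will contribute little and increase processing time. It is a very clean dataset in which the bird region of interest takes up at least 50% of the pixels in the image.

In your training generator (train_gen in the code below) you should have target_size=(150,150) and class_mode='categorical'. Your training generator currently uses class_mode='binary' while your validation generator uses 'categorical', and that mismatch is the likely source of the squeeze error. You also have save_format='jpg'. That parameter is ignored when you do not specify save_to_dir. It is good that you did not specify it, because during training you would fill that directory with a massive number of images. In your model, change the input to input_shape=(150, 150, 3) so that it matches the generators.

In the code below I added two callbacks, early_stop and rlronp. The first monitors the validation loss and stops training if the loss fails to decrease for 4 consecutive epochs. With restore_best_weights=True, it leaves the model with the weights from the epoch that had the lowest validation loss. The second also monitors the validation loss and multiplies the learning rate by 0.5 if the loss fails to decrease at the end of an epoch. Both are described in the Keras callbacks documentation. The working code is shown below: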

    import tensorflow as tf
    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # `model` is the Sequential model from the question, built with input_shape=(150, 150, 3)
    model.compile(Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
    train_dir = r'c:\temp\birds\train'  # change this to point to your directory
    valid_dir = r'c:\temp\birds\valid'  # change this to point to your directory
    test_dir = r'c:\temp\birds\test'    # change this to point to your directory
    # One-hot labels everywhere; only the training images are flipped and shuffled
    train_gen = ImageDataGenerator(rescale=1/255, horizontal_flip=True).flow_from_directory(
        train_dir, target_size=(150, 150), batch_size=32, seed=123,
        class_mode='categorical', color_mode='rgb', shuffle=True)
    valid_gen = ImageDataGenerator(rescale=1/255).flow_from_directory(
        valid_dir, target_size=(150, 150), batch_size=32, seed=123,
        class_mode='categorical', color_mode='rgb', shuffle=False)
    test_gen = ImageDataGenerator(rescale=1/255).flow_from_directory(
        test_dir, target_size=(150, 150), batch_size=32, seed=123,
        class_mode='categorical', color_mode='rgb', shuffle=False)
    # Stop after 4 epochs without a lower val_loss and restore the best weights
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=4, verbose=1, restore_best_weights=True)
    # Halve the learning rate after 1 epoch without a lower val_loss
    rlronp = tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=1, verbose=1)
    # The generators already define batching and shuffling, so fit/evaluate need no batch_size
    history = model.fit(x=train_gen, epochs=30, verbose=1,
                        callbacks=[early_stop, rlronp], validation_data=valid_gen)
    performance = model.evaluate(test_gen, verbose=1)[1] * 100  # index 1 is accuracy
    print('Model accuracy on test set is ', performance, ' %')
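
For readers hitting the same `Can not squeeze dim[1]` error: it is consistent with the two generators in the question producing labels of different shapes, since the training generator uses class_mode='binary' and the validation generator uses class_mode='categorical'. The sketch below is plain Python rather than Keras. It only mimics the label layout each mode yields for one batch, and the helper names `labels_for` and `shape_of` are made up for this illustration.

```python
NUM_CLASSES = 250  # number of bird species in the dataset


def labels_for(class_mode, class_indices, num_classes=NUM_CLASSES):
    """Mimic the label batch layout flow_from_directory yields for a class_mode."""
    if class_mode == "binary":
        # One float per image -> shape (batch,)
        return [float(i) for i in class_indices]
    if class_mode == "sparse":
        # One integer class index per image -> shape (batch,)
        return list(class_indices)
    if class_mode == "categorical":
        # One one-hot row per image -> shape (batch, num_classes)
        return [[1.0 if j == i else 0.0 for j in range(num_classes)]
                for i in class_indices]
    raise ValueError(f"unsupported class_mode: {class_mode}")


def shape_of(batch):
    """Return the shape of a flat or two-level nested list as a tuple."""
    if batch and isinstance(batch[0], list):
        return (len(batch), len(batch[0]))
    return (len(batch),)


batch_indices = [0, 17, 249]  # hypothetical class indices for a batch of 3 images
train_shape = shape_of(labels_for("binary", batch_indices))       # as in the question's training generator
valid_shape = shape_of(labels_for("categorical", batch_indices))  # as in the question's validation generator
print(train_shape, valid_shape)
```

This prints `(3,) (3, 250)`. One-hot labels of shape (batch, 250) pair with loss='categorical_crossentropy', while integer labels from class_mode='sparse' would pair with 'sparse_categorical_crossentropy' instead. The question does not show its compile call, so which loss it used is not known.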

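To make the two callback rules concrete, here is a simplified replay of them over a made-up list of validation losses. It is not Keras' implementation: the real callbacks have further options such as min_delta and cooldown that are ignored here, and `simulate_callbacks` and its loss values are hypothetical.

```python
import math


def simulate_callbacks(val_losses, initial_lr=0.001, stop_patience=4,
                       lr_patience=1, lr_factor=0.5):
    """Replay simplified EarlyStopping / ReduceLROnPlateau rules over recorded losses."""
    best_loss = math.inf
    best_epoch = None
    stop_wait = 0  # epochs since the last improvement (early stopping)
    lr_wait = 0    # epochs since the last improvement or learning-rate cut
    lr = initial_lr
    stopped_epoch = None
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset both counters
            best_loss, best_epoch = loss, epoch
            stop_wait = lr_wait = 0
            continue
        stop_wait += 1
        lr_wait += 1
        if lr_wait >= lr_patience:
            lr *= lr_factor  # plateau detected: shrink the learning rate
            lr_wait = 0
        if stop_wait >= stop_patience:
            stopped_epoch = epoch  # training ends; best_epoch's weights would be restored
            break
    return {"stopped_epoch": stopped_epoch, "best_epoch": best_epoch, "final_lr": lr}


result = simulate_callbacks([2.0, 1.5, 1.6, 1.4, 1.45, 1.5, 1.42, 1.41, 1.3])
print(result)
```

With these made-up losses, training stops at epoch index 7, the best weights come from epoch index 3, and the learning rate has been halved five times. The final value of 1.3 is never reached, which is the trade-off of a short patience.
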
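With 250 classes, your model will not reach a very high accuracy. The more classes there are, the harder the problem. I would create a more complex model with more convolutional layers and perhaps one more dense layer. If you add an extra dense layer, include a dropout layer to prevent overfitting.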
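
One way such a larger network might look is sketched below, assuming the 150x150 RGB input used by the generators. The filter counts, the 0.4 dropout rate, and the function name `build_deeper_model` are illustrative choices, not values from the dataset author. A model of this size may also be too large for the microcontroller mentioned in the question, so treat it as a starting point for training experiments only.

```python
import tensorflow as tf


def build_deeper_model(num_classes=250, input_shape=(150, 150, 3)):
    """Build a deeper CNN with an extra dense layer and dropout (illustrative sizes)."""
    layers = tf.keras.layers
    model = tf.keras.Sequential([tf.keras.Input(shape=input_shape)])
    # Six conv/pool blocks; padding='same' keeps the feature map from
    # shrinking below the 3x3 kernel size before the last block.
    for filters in (32, 64, 128, 128, 256, 256):
        model.add(layers.Conv2D(filters, (3, 3), padding='same', activation='relu'))
        model.add(layers.MaxPooling2D(2, 2))
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation='relu'))
    model.add(layers.Dropout(0.4))  # dropout after each dense layer to limit overfitting
    model.add(layers.Dense(256, activation='relu'))  # the extra dense layer
    model.add(layers.Dropout(0.4))
    model.add(layers.Dense(num_classes, activation='softmax'))
    return model


model = build_deeper_model()
model.summary()
```

The returned model has one softmax output per species, so it can be compiled with loss='categorical_crossentropy' and trained on generators that use class_mode='categorical'.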
