How to increase/improve the precision, recall and F1-score of the classification report?

Question

import tensorflow as tf 
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix
import pandas as pd
from tensorflow.keras.preprocessing.image import ImageDataGenerator
tf.__version__

2.7.0

from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).

batch_size = 32
train_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/gdrive...',
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
    )

val_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/gdrive...',
    validation_split=0.2,
    subset="validation",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)

test_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/gdrive...',
    image_size=image_size,
    batch_size=batch_size,
    )

test_datagen = ImageDataGenerator(rescale = 1./255.,)

test_generator = test_datagen.flow_from_directory(
    '/content/gdrive...',
    shuffle=False,  
    class_mode = 'categorical', 
    )

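Found 494 files belonging to 4 classes.
Using 396 files for training.
Found 494 files belonging to 4 classes.
Using 98 files for validation.
Found 60 files belonging to 4 classes.
Found 60 images belonging to 4 classes.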

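# class_names was not defined in the snippet; it is taken from the dataset here
class_names = train_ds.class_names
print(class_names)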

['basketball', 'soccerball', 'volleyball', 'waterpoloball']

plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(16):
    ax = plt.subplot(4, 4, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

Image

normalization_layer = tf.keras.layers.Rescaling(1./255)

normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
print(np.min(first_image), np.max(first_image))

0.0 0.89000225

AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

num_classes = 4
model = tf.keras.Sequential([
  tf.keras.layers.Rescaling(1./255),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(num_classes)
])

model.compile(
  optimizer='adam',
  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['accuracy'])

info=model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=15
)

Epoch 1/15
13/13 [==============================] - 3s 129ms/step - loss: 1.4904 - accuracy: 0.3636 - val_loss: 1.0347 - val_accuracy: 0.5408
Epoch 2/15
13/13 [==============================] - 1s 50ms/step - loss: 0.9531 - accuracy: 0.6086 - val_loss: 1.0026 - val_accuracy: 0.5204
Epoch 3/15
13/13 [==============================] - 1s 48ms/step - loss: 0.7760 - accuracy: 0.7348 - val_loss: 0.9383 - val_accuracy: 0.5612
Epoch 4/15
13/13 [==============================] - 1s 48ms/step - loss: 0.6378 - accuracy: 0.7778 - val_loss: 0.5954 - val_accuracy: 0.7653
Epoch 5/15
13/13 [==============================] - 1s 47ms/step - loss: 0.4761 - accuracy: 0.8561 - val_loss: 0.5973 - val_accuracy: 0.7347
Epoch 6/15
13/13 [==============================] - 1s 48ms/step - loss: 0.3942 - accuracy: 0.8586 - val_loss: 0.5779 - val_accuracy: 0.7551
Epoch 7/15
13/13 [==============================] - 1s 48ms/step - loss: 0.3948 - accuracy: 0.8662 - val_loss: 0.6813 - val_accuracy: 0.7449
Epoch 8/15
13/13 [==============================] - 1s 48ms/step - loss: 0.2844 - accuracy: 0.9015 - val_loss: 0.7343 - val_accuracy: 0.7143
Epoch 9/15
13/13 [==============================] - 1s 48ms/step - loss: 0.2735 - accuracy: 0.8965 - val_loss: 0.6145 - val_accuracy: 0.8061
Epoch 10/15
13/13 [==============================] - 1s 49ms/step - loss: 0.1997 - accuracy: 0.9217 - val_loss: 0.7033 - val_accuracy: 0.7347
Epoch 11/15
13/13 [==============================] - 1s 47ms/step - loss: 0.1365 - accuracy: 0.9444 - val_loss: 0.8867 - val_accuracy: 0.6939
Epoch 12/15
13/13 [==============================] - 1s 47ms/step - loss: 0.1674 - accuracy: 0.9394 - val_loss: 0.6450 - val_accuracy: 0.7449
Epoch 13/15
13/13 [==============================] - 1s 47ms/step - loss: 0.1869 - accuracy: 0.9293 - val_loss: 0.7596 - val_accuracy: 0.7653
Epoch 14/15
13/13 [==============================] - 1s 47ms/step - loss: 0.1176 - accuracy: 0.9646 - val_loss: 0.7559 - val_accuracy: 0.7755
Epoch 15/15
13/13 [==============================] - 1s 47ms/step - loss: 0.0532 - accuracy: 0.9848 - val_loss: 0.7178 - val_accuracy: 0.7857

model.summary()

Model summary

accuracy = info.history['accuracy']
val_accuracy  = info.history['val_accuracy']
loss = info.history['loss']
val_loss = info.history['val_loss']

plt.figure(figsize=(15,10))
plt.subplot(2, 2, 1)
plt.plot(accuracy, label = "Training accuracy")
plt.plot(val_accuracy, label="Validation accuracy")
plt.legend()
plt.title("Training vs validation accuracy")
plt.subplot(2,2,2)
plt.plot(loss, label = "Training loss")
plt.plot(val_loss, label="Validation loss")
plt.legend()
plt.title("Training vs validation loss")
plt.show()

Plots

pred = model.predict(test_ds)

y_pred = np.argmax(pred, axis=1)

y_pred_class={0: 'basketball',
1: 'soccerball',
2: 'volleyball',
3: 'waterpoloball'}

y_pred = list(map(lambda x: y_pred_class[x], y_pred))

y_true = test_generator.classes

y_true = list(map(lambda x: y_pred_class[x], y_true))

from sklearn.metrics import classification_report
print('Confusion Matrix:')
print(confusion_matrix(y_true, y_pred))
print('\n\n Classification Report:')
print(classification_report(y_true, y_pred))

Classification report and confusion matrix
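One thing that may be worth checking in the evaluation code above: image_dataset_from_directory shuffles by default, while test_generator was created with shuffle=False, so the rows of pred and test_generator.classes may not refer to the same images. A minimal sketch of one way to keep labels and predictions aligned is to collect both in the same pass over the batches. The helper name collect_labels_and_predictions and the stand-in fake_predict are hypothetical, used only for illustration.

```python
import numpy as np

def collect_labels_and_predictions(batches, predict_fn):
    """Iterate once over (images, labels) batches so both lists share one order."""
    y_true, y_pred = [], []
    for images, labels in batches:
        scores = np.asarray(predict_fn(images))    # shape (batch, num_classes)
        y_pred.append(np.argmax(scores, axis=1))   # predicted class index per image
        y_true.append(np.asarray(labels))          # true class index per image
    return np.concatenate(y_true), np.concatenate(y_pred)

# Stand-in "model" for the demo: each fake image is filled with its class id,
# and the fake predictor returns a one-hot score row for that id.
def fake_predict(images):
    classes = images[:, 0, 0, 0].astype(int)
    return np.eye(4)[classes]

rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=10)
images = np.broadcast_to(labels[:, None, None, None], (10, 2, 2, 1)).astype(float)

# Shuffle the data and split it into two batches, as a shuffled dataset would.
order = rng.permutation(10)
batches = [(images[order[:6]], labels[order[:6]]),
           (images[order[6:]], labels[order[6:]])]

y_true, y_pred = collect_labels_and_predictions(batches, fake_predict)
print(np.array_equal(y_true, y_pred))

# With the objects from the question, the call would presumably look like:
# y_true, y_pred = collect_labels_and_predictions(test_ds, model.predict_on_batch)
```

Because the labels come from the same batches as the predictions, the result does not depend on the order in which the dataset yields images.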

Answer 1

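There are many things you can try. One of the most effective is to use an adjustable learning rate. This can be done with the Keras callback ReduceLROnPlateau, which is described in the Keras documentation. Set the callback to monitor the validation loss. If the loss fails to improve for "patience" epochs, the learning rate is reduced to new_lr = old_lr * factor, where factor is a float between 0 and 1. The settings I use for this callback are shown in the code below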

rlronp=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.5,
                                             patience=1, verbose=1)
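To make the rule new_lr = old_lr * factor concrete, here is a rough pure-Python sketch that replays the validation losses logged in the question through a simplified plateau rule. It is an illustration, not the Keras implementation: it ignores the callback's min_delta and cooldown options, the function name simulate_reduce_lr is hypothetical, and the starting rate of 0.001 assumes Adam's default. The real losses would also change once the learning rate changes, so this only shows when reductions would be triggered on the logged run.

```python
def simulate_reduce_lr(val_losses, initial_lr=1e-3, factor=0.5, patience=1, min_lr=0.0):
    """Return the learning rate in effect during each epoch under a plateau rule."""
    lr = initial_lr
    best = float("inf")
    wait = 0
    rates = []
    for loss in val_losses:
        rates.append(lr)              # rate used while training this epoch
        if loss < best:               # improvement: remember it, reset the counter
            best = loss
            wait = 0
        else:                         # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)   # new_lr = old_lr * factor
                wait = 0
    return rates

# val_loss values copied from the training log in the question
val_loss = [1.0347, 1.0026, 0.9383, 0.5954, 0.5973, 0.5779, 0.6813, 0.7343,
            0.6145, 0.7033, 0.8867, 0.6450, 0.7596, 0.7559, 0.7178]

schedule = simulate_reduce_lr(val_loss)
for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.6g}")
```

On this log the first reduction would take effect from epoch 6, after epoch 5 failed to beat the validation loss of epoch 4.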

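I also recommend using the Keras callback EarlyStopping, which is also described in the Keras documentation. Set this callback to monitor the validation loss. If the validation loss fails to improve for "patience" consecutive epochs, training stops. Set the parameter restore_best_weights=True so that, if this callback stops training, your model is loaded with the weights from the epoch with the lowest validation loss. The code I use for this callback is shown below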

estop=tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=4, verbose=1,
                                        restore_best_weights=True)
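As a rough illustration of the patience logic, the sketch below replays the validation losses logged in the question through a simplified early-stopping rule. It is not the Keras implementation (it ignores min_delta and other options), the function name simulate_early_stopping is hypothetical, and it only says where the logged run would have stopped; with a learning-rate schedule the losses themselves would differ.

```python
def simulate_early_stopping(val_losses, patience=4):
    """Return (stop_epoch, best_epoch), both 1-based; stop_epoch is None if never triggered."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:               # new best validation loss
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1                 # another epoch without improvement
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# val_loss values copied from the training log in the question
val_loss = [1.0347, 1.0026, 0.9383, 0.5954, 0.5973, 0.5779, 0.6813, 0.7343,
            0.6145, 0.7033, 0.8867, 0.6450, 0.7596, 0.7559, 0.7178]

stop_epoch, best_epoch = simulate_early_stopping(val_loss, patience=4)
print(f"would stop after epoch {stop_epoch}, restoring weights from epoch {best_epoch}")
```

On the logged run the lowest validation loss is at epoch 6, and four epochs without improvement follow it, so the last five epochs would not have been trained.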

Now you need to use these callbacks in model.fit, so include the following argument in the model.fit call

callbacks=[rlronp, estop]

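Try this and see whether you get better results. You may also want to consider transfer learning instead of building your own model. For image classification, I like to use the EfficientNetB3 model. The code below shows my implementation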

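from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
# img_shape (height, width, 3) and num_classes must be defined beforehand
base_model = tf.keras.applications.EfficientNetB3(include_top=False, weights="imagenet",
                                                  input_shape=img_shape, pooling='max')
x = base_model.output
x = keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(x)
x = Dense(256, kernel_regularizer=regularizers.l2(0.016),
          activity_regularizer=regularizers.l1(0.006),
          bias_regularizer=regularizers.l1(0.006), activation='relu')(x)
x = Dropout(rate=.45, seed=123)(x)
# softmax output: compile with a loss that uses from_logits=False
output = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=output)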

Note that EfficientNet expects image pixels in the range 0 to 255, so do not rescale the image pixels.

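Another important factor in developing a good model is training it on a balanced training set. By this I mean that you want the classes to have roughly equal numbers of image samples. Take classifying dog and cat images as an example. Suppose you have 900 cat images and only 100 dog images. Your classifier will tend to predict cat. If it always predicts cat, it will achieve 90% accuracy. There are a couple of things you can do. One is undersampling: use only 200 of the cat images. Another is to use augmentation to increase the number of dog samples.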
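The cat/dog arithmetic can be worked through in a few lines. The sketch below is only an illustration with made-up labels matching the 900/100 example: it shows that an always-"cat" classifier reaches 90% accuracy while its precision, recall and F1 for "dog" are all zero, and then picks a random subset of 200 cat indices as one possible way to undersample. The helper name per_class_metrics and the random seed are assumptions for the example.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class, with 0.0 where a ratio is undefined."""
    tp = np.sum((y_pred == label) & (y_true == label))
    fp = np.sum((y_pred == label) & (y_true != label))
    fn = np.sum((y_pred != label) & (y_true == label))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return float(precision), float(recall), float(f1)

# 900 cats, 100 dogs, and a classifier that always answers "cat"
y_true = np.array(["cat"] * 900 + ["dog"] * 100)
y_pred = np.array(["cat"] * 1000)

accuracy = float(np.mean(y_true == y_pred))
cat_scores = per_class_metrics(y_true, y_pred, "cat")
dog_scores = per_class_metrics(y_true, y_pred, "dog")
print("accuracy:", accuracy)
print("cat precision/recall/F1:", cat_scores)
print("dog precision/recall/F1:", dog_scores)

# Undersampling: keep a random 200 of the 900 cat samples plus all 100 dogs
rng = np.random.default_rng(0)
cat_idx = np.flatnonzero(y_true == "cat")
dog_idx = np.flatnonzero(y_true == "dog")
keep_cat = rng.choice(cat_idx, size=200, replace=False)
balanced_idx = np.concatenate([keep_cat, dog_idx])
print("samples after undersampling:", len(balanced_idx))
```

This is why the per-class rows of the classification report are a better guide than overall accuracy when the classes are unbalanced.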

Solution 1
0 2021-11-30 06:02:10
