Classification model produces extremely low test accuracy, although training and validation accuracies are good for multiclass classification
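I am trying to classify the letters of the American Sign Language alphabet, so this is a multiclass classification task with 26 classes. My CNN model reaches 84% training accuracy and 91% validation accuracy, but the test accuracy is extremely low: only 7.7%!
I use ImageDataGenerator to generate the training and validation data: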
# Import assumed; the original post does not show it (standalone Keras 2.x).
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=0.2,        # in degrees
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2)

img_height = img_width = 256
batch_size = 16
source = '/home/hp/asl_detection/train'

train_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='training',  # set as training data
    color_mode='grayscale',
    seed=42,
)

# Uses the same augmenting datagen as the training generator.
validation_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='validation',  # set as validation data
    color_mode='grayscale',
    seed=42,
)
Here is my model code:
# Imports assumed; the original post does not show them (standalone Keras 2.x).
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam

img_rows = 256
img_cols = 256

def get_net():
    inputs = Input((img_rows, img_cols, 1))
    print("inputs shape:", inputs.shape)

    # Convolution layers
    conv1 = Conv2D(24, 3, strides=(2, 2), activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
    print("conv1 shape:", conv1.shape)
    conv2 = Conv2D(24, 3, strides=(2, 2), activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
    print("conv2 shape:", conv2.shape)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv2)
    print("pool1 shape:", pool1.shape)
    drop1 = Dropout(0.25)(pool1)

    conv3 = Conv2D(36, 3, strides=(2, 2), activation='relu', padding='same', kernel_initializer='he_normal')(drop1)
    print("conv3 shape:", conv3.shape)
    conv4 = Conv2D(36, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
    print("conv4 shape:", conv4.shape)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv4)
    print("pool2 shape:", pool2.shape)
    drop2 = Dropout(0.25)(pool2)

    conv5 = Conv2D(48, 3, activation='relu', padding='same', kernel_initializer='he_normal')(drop2)
    print("conv5 shape:", conv5.shape)
    conv6 = Conv2D(48, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
    print("conv6 shape:", conv6.shape)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv6)
    print("pool3 shape:", pool3.shape)
    drop3 = Dropout(0.25)(pool3)

    # Flattening
    flat = Flatten()(drop3)

    # Fully connected layers
    dense1 = Dense(128, activation='relu', use_bias=True, kernel_initializer='he_normal')(flat)
    print("dense1 shape:", dense1.shape)
    drop4 = Dropout(0.5)(dense1)
    dense2 = Dense(128, activation='relu', use_bias=True, kernel_initializer='he_normal')(drop4)
    print("dense2 shape:", dense2.shape)
    drop5 = Dropout(0.5)(dense2)
    dense4 = Dense(26, activation='softmax', use_bias=True, kernel_initializer='he_normal')(drop5)
    print("dense4 shape:", dense4.shape)

    # Keras 2 expects the keyword names `inputs` and `outputs`.
    model = Model(inputs=inputs, outputs=dense4)
    optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=0.00000001, decay=0.0)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
Here is the training code:
# Imports assumed; the original post does not show them.
import matplotlib.pyplot as plt
from keras.callbacks import ModelCheckpoint

def train():
    model = get_net()
    print("got model")
    model.summary()

    # Note: this callback is created but never passed to fit_generator,
    # so no checkpoint file is written during training.
    model_checkpoint = ModelCheckpoint('seqnet.hdf5', monitor='loss', verbose=1, save_best_only=True)
    print('Fitting model...')

    history = model.fit_generator(
        train_generator,
        steps_per_epoch=train_generator.samples // batch_size,
        validation_data=validation_generator,
        validation_steps=validation_generator.samples // batch_size,
        epochs=100)

    # list all data in history
    print(history.history.keys())

    # summarize history for accuracy
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()

    # summarize history for loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()

    return model

model = train()
Here are the training logs for the last few epochs:
Epoch 95/100
72/72 [==============================] - 74s 1s/step - loss: 0.4326 - acc: 0.8523 - val_loss: 0.2198 - val_acc: 0.9118
Epoch 96/100
72/72 [==============================] - 89s 1s/step - loss: 0.4591 - acc: 0.8418 - val_loss: 0.1944 - val_acc: 0.9412
Epoch 97/100
72/72 [==============================] - 90s 1s/step - loss: 0.4387 - acc: 0.8533 - val_loss: 0.2802 - val_acc: 0.8971
Epoch 98/100
72/72 [==============================] - 106s 1s/step - loss: 0.4680 - acc: 0.8349 - val_loss: 0.2206 - val_acc: 0.9228
Epoch 99/100
72/72 [==============================] - 85s 1s/step - loss: 0.4459 - acc: 0.8427 - val_loss: 0.2861 - val_acc: 0.9081
Epoch 100/100
72/72 [==============================] - 74s 1s/step - loss: 0.4639 - acc: 0.8472 - val_loss: 0.2866 - val_acc: 0.9191
dict_keys(['val_loss', 'loss', 'acc', 'val_acc'])
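These are the model accuracy and loss curves:
Unlike the training and validation data, I did not use ImageDataGenerator to prepare the test data. For the test data, I used OpenCV to convert the images to grayscale and then normalized them. In the same loop, I generated the corresponding label for each image to prevent any order mismatch. I saved the image filenames and labels in a csv file. Here is the code: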
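# Imports assumed; the original post does not show them.
import csv
import os

import cv2
import numpy as np

source = '/home/hp/asl_detection/test/unknown'
files = os.listdir(source)
test_data = []
rows = []

for file in files:
    # Each CSV row is: filename, class letter, class index.
    row = []
    row.append(file)
    row.append(file[6])              # character at index 6 of the filename is the letter
    print(file)
    row.append(ord(file[6]) - 97)    # 'a' -> 0, 'b' -> 1, ..., 'z' -> 25
    rows.append(row)

    img = cv2.imread(os.path.join(source, file))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img, (256, 256))
    test_data.append(img)

# Scale pixel values to [0, 1], matching rescale=1./255 used for training.
test_data = np.array(test_data, dtype="float") / 255.0
print(test_data)
print(test_data.shape)

with open("/home/hp/asl_detection/test/alpha_class.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)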
Here are a few tuples from the csv:
In addition, I reshaped the test image array to provide the channel dimension:
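The label column above is computed as `ord(file[6]) - 97`, which only matches the training labels if the class folders sort as `a` through `z` with nothing else among them. `flow_from_directory` assigns indices from the sorted sub-directory names and exposes the mapping as `train_generator.class_indices`. A minimal sketch of deriving the test labels from that mapping instead follows; the helper name `labels_from_filenames`, the example folder names, and the example filenames are hypothetical, and only the `file[6]` position is taken from the code above.

```python
def labels_from_filenames(filenames, class_indices, char_pos=6):
    """Map each test filename to the class index used during training.

    class_indices is the dict exposed by the training generator
    (train_generator.class_indices), for example {'A': 0, 'B': 1, ...}.
    char_pos=6 mirrors file[6] in the question; the filename layout
    itself is an assumption.
    """
    # Compare case-insensitively so 'a' in a filename matches a folder named 'A'.
    lookup = {name.lower(): idx for name, idx in class_indices.items()}
    return [lookup[name[char_pos].lower()] for name in filenames]


# Hypothetical mapping: uppercase folders plus an extra folder '0' that
# sorts first and shifts every letter index by one.
class_indices = {"0": 0, "A": 1, "B": 2, "C": 3}
files = ["image_a_01.jpg", "image_c_07.jpg"]
print(labels_from_filenames(files, class_indices))  # -> [1, 3]
```

In this made-up case `ord(file[6]) - 97` would have produced 0 and 2, so every label would be off by one. Printing `train_generator.class_indices` is a quick way to see whether the real mapping is the plain `a=0 ... z=25` that the CSV assumes.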
test_data = test_data.reshape((test_data.shape[0], img_rows, img_cols, 1))
Finally, I predicted the classes and computed the accuracy on the test data, taking the labels from the csv:
# Imports assumed; the original post does not show them.
import pandas as pd
from sklearn.metrics import accuracy_score

y_proba = model.predict(test_data)
y_classes = y_proba.argmax(axis=-1)

data = pd.read_csv('/home/hp/asl_detection/test/alpha_class.csv', header=None)
original_classes = data.iloc[:, 2]   # third column holds the class index
original_classes = original_classes.tolist()
y_classes = y_classes.tolist()
acc = accuracy_score(original_classes, y_classes) * 100
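A single accuracy figure does not show whether the predictions collapse onto a few classes or are shifted in some systematic way. The sketch below is one possible diagnostic that takes the two lists built above (`original_classes` and `y_classes`); the function name `summarize_predictions` and its `top` parameter are hypothetical, and the sample inputs are invented for illustration.

```python
from collections import Counter


def summarize_predictions(true_labels, predicted_labels, top=5):
    """Return accuracy, the most frequent predicted classes,
    and the most frequent (true, predicted) mistakes."""
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    # Which classes does the model predict most often?
    predicted_counts = Counter(predicted_labels).most_common(top)
    # Which (true, predicted) confusions occur most often?
    mistakes = Counter((t, p) for t, p in pairs if t != p).most_common(top)
    return accuracy, predicted_counts, mistakes


# Invented example: every prediction is shifted up by one class index.
accuracy, predicted_counts, mistakes = summarize_predictions([0, 1, 2, 3], [1, 2, 3, 3])
print(accuracy)          # -> 0.25
print(predicted_counts)
print(mistakes)
```

If most mistakes have the form `(k, k + 1)`, or nearly all predictions land on one or two classes, that points toward a label or preprocessing mismatch rather than toward the model itself.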
Can you figure out why the test accuracy is so low? Please let me know if any further information is needed.
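I think you are facing an overfitting problem, and the validation set is misleading you. For the validation set not to be misleading, it must have the same distribution as the test set. So try to generate the test set and the validation set from the same distribution, and do not apply data augmentation to the validation dataset.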
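One way to act on the "same distribution" advice is to pool the labeled images you plan to hold out and split them per class into a validation list and a test list. This is a rough sketch under that assumption, not the only way to do it; `stratified_split`, its parameters, and the example filenames are hypothetical, and it uses only the Python standard library.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.5, seed=42):
    """Split (filename, label) pairs into validation and test lists
    so that each class contributes the same share to both."""
    by_label = defaultdict(list)
    for filename, label in samples:
        by_label[label].append(filename)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    validation, test = [], []
    for label in sorted(by_label):
        files = sorted(by_label[label])
        rng.shuffle(files)
        cut = int(round(len(files) * test_fraction))
        test.extend((f, label) for f in files[:cut])
        validation.extend((f, label) for f in files[cut:])
    return validation, test


# Invented held-out pool: 10 images for each of three letters.
pool = [("image_%s_%02d.jpg" % (c, i), c) for c in "abc" for i in range(10)]
validation, test = stratified_split(pool)
print(len(validation), len(test))  # -> 15 15
```

Both lists would then need to go through the same loading and preprocessing path (same grayscale conversion, resizing and rescaling, no augmentation) so that validation accuracy is a fair preview of test accuracy.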