
Classification model produces extremely low test accuracy, although training and validation accuracies are good for multiclass classification

I'm trying to do alphabet classification for American Sign Language, so it's a multiclass classification task with 26 classes. My CNN model reached 84% training accuracy and 91% validation accuracy, yet the test accuracy is hilariously low - only 7.7%!

I used ImageDataGenerator to produce training and validation data:

from keras.preprocessing.image import ImageDataGenerator  # standalone Keras 2.x

datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=0.2,
        width_shift_range=0.05,
        height_shift_range=0.05,
        shear_range=0.05,
        horizontal_flip=True,
        fill_mode='nearest',
        validation_split=0.2)

img_height = img_width = 256

batch_size = 16 
source = '/home/hp/asl_detection/train'

train_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='training', # set as training data
    color_mode='grayscale',
    seed=42,
    )

validation_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='validation', # set as validation data
    color_mode='grayscale',
    seed=42,
    ) 

This is my model code:

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam

img_rows = 256
img_cols = 256

def get_net():

    inputs = Input((img_rows, img_cols, 1))
    print("inputs shape:",inputs.shape)

    #Convolution layers
    conv1 = Conv2D(24, 3, strides=(2, 2), activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    print("conv1 shape:",conv1.shape)
      
    conv2 = Conv2D(24, 3, strides=(2, 2), activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    print("conv2 shape:",conv2.shape)
    
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv2)
    print("pool1 shape:",pool1.shape)
    
    drop1 = Dropout(0.25)(pool1)

    conv3 = Conv2D(36, 3, strides=(2, 2), activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(drop1)
    print("conv3 shape:",conv3.shape)

    conv4 = Conv2D(36, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    print("conv4 shape:",conv4.shape)
    
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv4)
    print("pool2 shape:",pool2.shape)
    
    drop2 = Dropout(0.25)(pool2)

    conv5 = Conv2D(48, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(drop2)
    print("conv5 shape:",conv5.shape)
    
    conv6 = Conv2D(48, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    print("conv6 shape:",conv6.shape)
    
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv6)
    print("pool3 shape:",pool3.shape)
    
    drop3 = Dropout(0.25)(pool3)

    #Flattening
    flat = Flatten()(drop3)

    #Fully connected layers
    dense1 = Dense(128, activation = 'relu', use_bias=True, kernel_initializer = 'he_normal')(flat)
    print("dense1 shape:",dense1.shape)
    drop4 = Dropout(0.5)(dense1)

    dense2 = Dense(128, activation = 'relu', use_bias=True, kernel_initializer = 'he_normal')(drop4)
    print("dense2 shape:",dense2.shape)
    drop5 = Dropout(0.5)(dense2)

    dense4 = Dense(26, activation = 'softmax', use_bias=True, kernel_initializer = 'he_normal')(drop5)
    print("dense4 shape:",dense4.shape)
            
    model = Model(inputs=inputs, outputs=dense4)

    optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=0.00000001, decay=0.0)

    model.compile(optimizer = optimizer, loss = 'categorical_crossentropy', metrics = ['accuracy'])

    return model

This is the training code:

from keras.callbacks import ModelCheckpoint
import matplotlib.pyplot as plt

def train():
    
    model = get_net()
    print("got model")
    model.summary()

    model_checkpoint = ModelCheckpoint('seqnet.hdf5', monitor='loss',verbose=1, save_best_only=True)
    print('Fitting model...')
    
    history = model.fit_generator(
        train_generator,
        steps_per_epoch = train_generator.samples // batch_size,
        validation_data = validation_generator,
        validation_steps = validation_generator.samples // batch_size,
        epochs = 100)
    
    # list all data in history
    print(history.history.keys())
    # summarize history for accuracy
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
    # summarize history for loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show() 
    
    
    return model

model = train()

This is the training log for the last few epochs:

Epoch 95/100
72/72 [==============================] - 74s 1s/step - loss: 0.4326 - acc: 0.8523 - val_loss: 0.2198 - val_acc: 0.9118
Epoch 96/100
72/72 [==============================] - 89s 1s/step - loss: 0.4591 - acc: 0.8418 - val_loss: 0.1944 - val_acc: 0.9412
Epoch 97/100
72/72 [==============================] - 90s 1s/step - loss: 0.4387 - acc: 0.8533 - val_loss: 0.2802 - val_acc: 0.8971
Epoch 98/100
72/72 [==============================] - 106s 1s/step - loss: 0.4680 - acc: 0.8349 - val_loss: 0.2206 - val_acc: 0.9228
Epoch 99/100
72/72 [==============================] - 85s 1s/step - loss: 0.4459 - acc: 0.8427 - val_loss: 0.2861 - val_acc: 0.9081
Epoch 100/100
72/72 [==============================] - 74s 1s/step - loss: 0.4639 - acc: 0.8472 - val_loss: 0.2866 - val_acc: 0.9191
dict_keys(['val_loss', 'loss', 'acc', 'val_acc'])

These are the curves for the model's accuracy and loss:

[image: training/validation accuracy and loss curves]

Unlike the training and validation data, I didn't use ImageDataGenerator to prepare the test data. For the test data I used OpenCV to convert the images to grayscale and then normalized them. In the same loop I generated the corresponding label for each image to prevent any order mismatch, and I saved the image file names and labels in a CSV file. Here's the code:

import os
import csv
import cv2
import numpy as np

source = '/home/hp/asl_detection/test/unknown'
files = os.listdir(source)
test_data = []
rows = []
for file in files:
    
    row = []
    row.append(file)
    row.append(file[6])
    print(file)
    row.append(ord(file[6]) - 97)  
    rows.append(row) 
    
    img = cv2.imread(os.path.join(source, file))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img,(256, 256))
    test_data.append(img)
    
test_data = np.array(test_data, dtype="float") / 255.0
print(test_data)
print(test_data.shape)

with open("/home/hp/asl_detection/test/alpha_class.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

Here are a few rows of the CSV:

[image: sample CSV rows]

Then I reshaped the test image array to add the channel dimension:

test_data = test_data.reshape((test_data.shape[0], img_rows, img_cols, 1))

Finally, I predicted the classes and computed the test accuracy by fetching the labels from the CSV:

import pandas as pd
from sklearn.metrics import accuracy_score

y_proba = model.predict(test_data)
y_classes = y_proba.argmax(axis=-1)
data = pd.read_csv('/home/hp/asl_detection/test/alpha_class.csv', header=None)
original_classes = data.iloc[:, 2]
original_classes = original_classes.tolist()
y_classes = y_classes.tolist()
acc = accuracy_score(original_classes, y_classes) * 100

Can you find the reason behind such a low test accuracy? If any further information is needed, please let me know.

I think you are facing an overfitting problem and the validation set is misleading you. For the validation not to be misleading, it has to have the same distribution as the test set, so try to generate the test and validation sets from the same distribution; also, don't apply data augmentation to the validation set.
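A minimal sketch of what that could look like with the setup above (standalone Keras, same directory layout). This is not the asker's code: `plain_datagen`, `test_datagen` and `test_generator` are illustrative names, and the test folder being re-arranged into per-class subdirectories is an assumption:

# Sketch only: keep augmentation for the training subset, feed validation and
# test through rescale-only generators so all three splits see the same
# preprocessing.
from keras.preprocessing.image import ImageDataGenerator

# Rescale-only generator; using the same validation_split keeps the
# train/validation file split identical to the augmented generator's split
# (the split is deterministic, not random).
plain_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

# Training keeps the augmented `datagen` from the question (subset='training').
# Validation now comes from the un-augmented generator:
validation_generator = plain_datagen.flow_from_directory(
    source,
    target_size=(256, 256),
    batch_size=16,
    shuffle=True,
    class_mode='categorical',
    subset='validation',
    color_mode='grayscale',
    seed=42)

# If the test images can be arranged into per-class subfolders (one folder per
# letter, like the training data), evaluating them through the identical
# pipeline removes any preprocessing or label-ordering mismatch:
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_directory(
    '/home/hp/asl_detection/test',   # hypothetical layout: test/a, test/b, ...
    target_size=(256, 256),
    batch_size=16,
    shuffle=False,
    class_mode='categorical',
    color_mode='grayscale')

loss, acc = model.evaluate_generator(test_generator, steps=len(test_generator))
print('test accuracy:', acc)

If the training and test images still come from very different sources (different hands, backgrounds, lighting), the validation score will keep overestimating test performance no matter how the generators are configured, which is the core of the distribution-mismatch point above.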
