
Why is validation accuracy higher than training accuracy at the first epoch?

I'm working on video classification with 5 classes, using a TimeDistributed CNN + RNN model. Per class, the training set contains 70 videos of 20 frames each, and the validation and test sets each contain 15 videos of 20 frames each. So, in total, I'm working with 500 videos. The batch size is 64. I compiled the model with the RMSprop optimizer and categorical cross-entropy loss.
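To make the numbers above concrete, here is a small sketch of the arithmetic. The variable names are my own, and I'm assuming the final partial batch is kept rather than dropped.

```python
import math

NUM_CLASSES = 5
VIDEOS_PER_CLASS = {"train": 70, "val": 15, "test": 15}
BATCH_SIZE = 64

# Videos in each split across all 5 classes
split_sizes = {name: n * NUM_CLASSES for name, n in VIDEOS_PER_CLASS.items()}
total_videos = sum(split_sizes.values())

# Weight updates per epoch, assuming the last partial batch is kept
steps_per_epoch = math.ceil(split_sizes["train"] / BATCH_SIZE)
last_batch_size = split_sizes["train"] - (steps_per_epoch - 1) * BATCH_SIZE

print(split_sizes)        # videos per split
print(total_videos)       # all videos
print(steps_per_epoch)    # batches per training epoch
print(last_batch_size)    # size of the final, smaller batch
```

So each epoch sees 350 training videos in only a handful of batches, and validation runs on 75 videos.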

I trained the model for 65 epochs and noticed something strange: validation accuracy is higher than training accuracy at the first epoch. For the remaining epochs, the curves look satisfactory.


My model is:

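# Imports assume the Keras that ships with TensorFlow 2.x
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Flatten,
                                     LSTM, MaxPooling2D, TimeDistributed)

# One video: (frames, height, width, channels)
input_shape = (20, 128, 128, 3)

model = Sequential()
model.add(BatchNormalization(input_shape=input_shape))

# CNN feature extractor, applied to each of the 20 frames independently
model.add(TimeDistributed(Conv2D(32, (3, 3), strides=(1, 1), activation='relu', padding='same')))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Conv2D(64, (3, 3), strides=(1, 1), activation='relu', padding='same')))
model.add(TimeDistributed(Conv2D(128, (3, 3), strides=(1, 1), activation='relu', padding='same')))
model.add(TimeDistributed(Conv2D(128, (3, 3), strides=(1, 1), activation='relu', padding='same')))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Conv2D(256, (3, 3), strides=(1, 1), activation='relu', padding='same')))
model.add(TimeDistributed(MaxPooling2D((2, 2))))

# One flat feature vector per frame
model.add(TimeDistributed(Flatten()))

# LSTM over the frame sequence; only the final state is returned
model.add(LSTM(256, activation='relu', return_sequences=False))
model.add(Dense(128, activation='relu'))

# 5-way softmax classifier
model.add(Dense(5, activation='softmax'))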
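For reference, the tensor shapes through this stack can be traced by hand. The sketch below is only that arithmetic, not Keras itself. The helper `trace_cnn` and the layer list are my own illustration, and the parameter count uses the standard LSTM formula (4 gates, each with input weights, recurrent weights and a bias).

```python
def trace_cnn(height, width, channels, layers):
    """Follow the per-frame feature map through conv/pool layers."""
    for kind, arg in layers:
        if kind == "conv":
            # 3x3 conv, stride 1, padding='same': spatial size unchanged
            channels = arg
        elif kind == "pool":
            # 2x2 max pooling: spatial size is divided by the pool size
            height //= arg
            width //= arg
    return height, width, channels

# Same order as the model above
LAYERS = [("conv", 32), ("pool", 2), ("conv", 64), ("conv", 128),
          ("conv", 128), ("pool", 2), ("conv", 256), ("pool", 2)]

h, w, c = trace_cnn(128, 128, 3, LAYERS)
features_per_frame = h * w * c  # length of each flattened frame vector

lstm_units = 256
lstm_params = 4 * ((features_per_frame + lstm_units) * lstm_units + lstm_units)

print((h, w, c), features_per_frame, lstm_params)
```

If the arithmetic is right, the LSTM receives a very long vector per frame, so most of the model's weights sit in that one layer. `model.summary()` should confirm the exact numbers.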

Can anyone tell me why validation accuracy is higher than training accuracy at the first epoch?

My guess: because you only have 5 balanced classes, predicting the same class for every sample already gives an accuracy of 20%. You are at around 32%, so only slightly better than that baseline.
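The 20% figure can be checked with a few lines. This is a sketch with labels built from the class counts in the question (15 validation videos per class). The `accuracy` helper is my own.

```python
NUM_CLASSES = 5
VAL_VIDEOS_PER_CLASS = 15

# Balanced ground-truth labels: 15 videos for each of the 5 classes
y_true = [c for c in range(NUM_CLASSES) for _ in range(VAL_VIDEOS_PER_CLASS)]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# A "model" that always answers class 0, whatever the input
constant_guess = [0] * len(y_true)
baseline = accuracy(y_true, constant_guess)

print(f"constant-guess accuracy: {baseline:.0%}")
```

Any first-epoch accuracy should be read against this floor rather than against 0%.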

I usually don't look at the initial accuracy, because the model is still really bad at that point. I actually remove the first N epochs (in this case maybe 20 to 30) from the plot to show the performance better.
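Dropping the first N epochs from the plot could look roughly like this. It is a sketch: `drop_warmup` and `plot_accuracy` are names I made up, and I'm assuming the history is a dict of per-epoch lists with the keys `accuracy` and `val_accuracy` (older Keras versions use `acc` and `val_acc`), as in the `history.history` returned by `model.fit`.

```python
def drop_warmup(history, skip):
    """Return (epoch_numbers, trimmed_history) without the first `skip` epochs."""
    n_epochs = len(next(iter(history.values())))
    epochs = list(range(skip + 1, n_epochs + 1))  # 1-based epoch numbers
    trimmed = {name: values[skip:] for name, values in history.items()}
    return epochs, trimmed

def plot_accuracy(history, skip=0):
    """Plot train/validation accuracy, skipping the noisy first epochs."""
    import matplotlib.pyplot as plt  # imported here so drop_warmup works without it

    epochs, trimmed = drop_warmup(history, skip)
    plt.plot(epochs, trimmed["accuracy"], label="train")
    plt.plot(epochs, trimmed["val_accuracy"], label="validation")
    plt.xlabel("epoch")
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()

# Usage with a real run (hypothetical variable name):
#   history = model.fit(...)
#   plot_accuracy(history.history, skip=20)
```

The x-axis keeps the original epoch numbers, so the plot starts at epoch N+1 instead of being renumbered from 1.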

Check the confusion matrix after the first epoch; the model is probably only good at a few classes.
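A minimal way to do that check without extra libraries is sketched below. The labels in the example are made up purely to show the output format, and the two helper functions are my own. With a Keras model, the predicted labels would come from something like `model.predict(x_val).argmax(axis=1)` after training for one epoch, where `x_val` is a hypothetical name for your validation array.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_recall(matrix):
    """Share of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Made-up toy labels (3 classes), NOT results from the model above
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 0, 1, 0, 0]

toy_matrix = confusion_matrix(toy_true, toy_pred, num_classes=3)
toy_recall = per_class_recall(toy_matrix)

for row in toy_matrix:
    print(row)
print(toy_recall)
```

If an early-epoch model mostly predicts one or two classes, most of the counts pile up in those columns and the other classes show a recall near zero.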
