How can I improve the validation and test accuracy of my transfer-learning BERT model?

Question

I trained my BERT model and got 99% accuracy on the training set, but only 80% on the validation set. How can I improve my validation accuracy?

Code:

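import tensorflow as tf
from transformers import TFBertModel

# Method of the model class; self.MAX_LEN and self.MODEL_NAME are set elsewhere.
def build_model(self, n_categories):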
    input_word_ids = tf.keras.Input(shape=(self.MAX_LEN,), dtype=tf.int32, name='input_word_ids')
    input_mask = tf.keras.Input(shape=(self.MAX_LEN,), dtype=tf.int32, name='input_mask')
    input_type_ids = tf.keras.Input(shape=(self.MAX_LEN,), dtype=tf.int32, name='input_type_ids')

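    # Load the pretrained model from HuggingFace (BERT here; the variable name and
    # the commented-out line below are left over from a RoBERTa variant)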
    #roberta_model = TFRobertaModel.from_pretrained(self.MODEL_NAME, num_labels = n_categories, output_attentions = False, output_hidden_states = False)
    roberta_model = TFBertModel.from_pretrained(self.MODEL_NAME, num_labels = n_categories, output_attentions = True, output_hidden_states = True)
    
    # for layer in roberta_model.layers[:-15]:
    #   layer.trainable = False

    x = roberta_model(input_word_ids, attention_mask=input_mask, token_type_ids=input_type_ids)

    # HuggingFace models return several outputs; the first is the last hidden state,
    # one embedding per token with shape (batch, MAX_LEN, hidden_size), so take index 0
    x = x[0]

    x = tf.keras.layers.Dropout(0.1)(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(256, activation='relu')(x)
    x = tf.keras.layers.Dense(n_categories, activation='softmax')(x)

    model = tf.keras.Model(inputs=[input_word_ids, input_mask, input_type_ids], outputs=x)
    # `lr` is a deprecated alias; `learning_rate` is the current argument name
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    return model

Answer 1

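Based on the information you have provided, your model appears to be overfitting. Reaching 99% accuracy on the training set with significantly lower accuracy on the validation set suggests that the model is simply memorizing the training data and therefore generalizes poorly to the validation set.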

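In this situation, the first two hyperparameters I would consider tuning are the number of epochs and the learning rate. Your initial goal should be to reach similar accuracy on the training and validation sets, even if that is only around 80%. This usually means reducing the number of epochs until the two accuracies are roughly equal.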

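In the graph, the blue line is the training error, the red line is the validation error, and the x-axis is the number of epochs. You can see that the training error keeps decreasing even after the validation error starts to increase (where the warning sign is). Ideally, you should stop training at the epoch under the warning sign.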
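To make the stopping rule concrete, here is a minimal pure-Python sketch of patience-based early stopping. The function name `early_stopping_epoch`, its parameters, and the validation errors are made up for illustration and are not from the question. In Keras, the built-in `tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=..., restore_best_weights=True)` callback applies the same idea during `model.fit`.

```python
from typing import List, Optional, Tuple


def early_stopping_epoch(val_errors: List[float], patience: int = 2,
                         min_delta: float = 0.0) -> Tuple[int, Optional[int]]:
    """Return (best_epoch, stop_epoch), both 1-based.

    best_epoch is the epoch with the lowest validation error seen so far.
    stop_epoch is the epoch at which training would halt, or None if the
    validation error never failed to improve `patience` times in a row.
    """
    if not val_errors:
        raise ValueError("val_errors must contain at least one epoch")
    best_error = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, error in enumerate(val_errors, start=1):
        if error < best_error - min_delta:
            # Validation error improved: remember this epoch, reset the counter.
            best_error = error
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, None


# Hypothetical validation errors: they fall for three epochs, then rise.
history = [0.62, 0.48, 0.41, 0.43, 0.47, 0.55]
best, stopped = early_stopping_epoch(history, patience=2)
print(best, stopped)  # prints "3 5"
```

Here training would halt at epoch 5 and you would keep the weights from epoch 3, the last epoch before the validation error turned upward.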

From there you can start tuning the model's other parameters, such as the optimizer and any available regularization parameters.

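Also, it is not clear from your question whether you are using a test set. It is recommended to split your data into three parts (training, validation, and test). The test data should not be used during training, however; it should only be used independently, after the model has been trained.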
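A three-way split can be sketched in a few lines of plain Python. The function name `train_val_test_split`, the 80/10/10 proportions, and the fixed seed are assumptions chosen for illustration. For a classification task you may prefer a stratified split (for example scikit-learn's `train_test_split` with its `stratify` argument) so that each part keeps the same class balance.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_val_test_split(items: Sequence[T], val_fraction: float = 0.1,
                         test_fraction: float = 0.1, seed: int = 42
                         ) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle once, then cut the data into disjoint train/val/test parts."""
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be between 0 and 1")
    shuffled = list(items)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical dataset of 100 example ids.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # prints "80 10 10"
```

Tune the number of epochs and the learning rate against `val` only, and evaluate on `test` once, after the final model is chosen.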

Solution 1
0 2022-08-29 16:53:44