
Binary classification model using BERT encoder stuck at 50% accuracy

I'm trying to train a simple model for the Yelp binary classification task.

Load the BERT encoder:

import json
import os

import tensorflow as tf
from official.nlp import bert
import official.nlp.bert.bert_models
import official.nlp.bert.configs

gs_folder_bert = "gs://cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12"
bert_config_file = os.path.join(gs_folder_bert, "bert_config.json")
config_dict = json.loads(tf.io.gfile.GFile(bert_config_file).read())
bert_config = bert.configs.BertConfig.from_dict(config_dict)
# classifier_model returns (classifier, encoder); only the encoder is kept here.
_, bert_encoder = bert.bert_models.classifier_model(
    bert_config, num_labels=2)
checkpoint = tf.train.Checkpoint(model=bert_encoder)
checkpoint.restore(
    os.path.join(gs_folder_bert, 'bert_model.ckpt')).assert_consumed()
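
As a quick sanity check (not in the original question), the restored encoder is a Keras model, so the inputs it expects can be inspected before building anything on top of it:

# The Model Garden encoder takes three int32 tensors: token ids,
# input mask, and segment (type) ids.
for t in bert_encoder.inputs:
    print(t.name, t.shape, t.dtype)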

Load the data:

import tensorflow_datasets as tfds

data, info = tfds.load('yelp_polarity_reviews', with_info=True, batch_size=-1, as_supervised=True)
train_x_orig, train_y_orig = tfds.as_numpy(data['train'])
train_x = encode_examples(train_x_orig)  # tokenize reviews into BERT's input format
train_y = train_y_orig
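
The question doesn't show encode_examples. Here is a hypothetical sketch of such a helper, assuming the Model Garden FullTokenizer and the input names the encoder expects (input_word_ids, input_mask, input_type_ids); none of this is from the original question:

import numpy as np
from official.nlp.bert import tokenization

# Hypothetical stand-in for the asker's undefined helper.
tokenizer = tokenization.FullTokenizer(
    vocab_file=os.path.join(gs_folder_bert, "vocab.txt"), do_lower_case=True)

def encode_examples(texts, max_length=128):
    word_ids, masks, type_ids = [], [], []
    for text in texts:
        tokens = ["[CLS]"] + tokenizer.tokenize(text.decode("utf-8"))[:max_length - 2] + ["[SEP]"]
        ids = tokenizer.convert_tokens_to_ids(tokens)
        pad = max_length - len(ids)
        word_ids.append(ids + [0] * pad)
        masks.append([1] * len(ids) + [0] * pad)
        type_ids.append([0] * max_length)  # single-segment input
    return {
        "input_word_ids": np.array(word_ids, dtype=np.int32),
        "input_mask": np.array(masks, dtype=np.int32),
        "input_type_ids": np.array(type_ids, dtype=np.int32),
    }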

Use BERT to embed the data:

encoder_output = bert_encoder.predict(train_x)

Set up the model:

from tensorflow import keras
from tensorflow.keras.optimizers import SGD

inputs = keras.Input(shape=(768,))
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(8, activation='relu')(x)
outputs = keras.layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
sgd = SGD(learning_rate=0.0001)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])

Train:

model.fit(encoder_output[0], train_y, batch_size=64, epochs=3)
# encoder_output[0].shape == (10000, 1, 768)
# train_y.shape == (100000,)

Training results:

Epoch 1/5
157/157 [==============================] - 1s 5ms/step - loss: 0.6921 - accuracy: 0.5455
Epoch 2/5
157/157 [==============================] - 1s 5ms/step - loss: 0.6918 - accuracy: 0.5455
Epoch 3/5
157/157 [==============================] - 1s 5ms/step - loss: 0.6915 - accuracy: 0.5412
Epoch 4/5
157/157 [==============================] - 1s 5ms/step - loss: 0.6913 - accuracy: 0.5407
Epoch 5/5
157/157 [==============================] - 1s 5ms/step - loss: 0.6911 - accuracy: 0.5358

I tried different learning rates, but the main issue seems to be that each epoch takes only about 1 second and the accuracy stays at ~0.5. Am I not setting up the inputs/model correctly?

Your BERT model is not training. It has to be placed before the dense layers and trained as part of the model: the input layer has to take not pre-computed BERT vectors, but the sequence of tokens, cropped to max_length and padded. Here is example code (see the beginning of the create_model function): https://keras.io/examples/nlp/text_extraction_with_bert/
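
A minimal sketch of that fix using the objects already loaded in the question: bert.bert_models.classifier_model already returns a full classifier with the encoder wired in ahead of the classification head, so it can be fine-tuned end-to-end on the tokenized inputs instead of on pre-computed embeddings. The optimizer, learning rate, and batch size below are illustrative assumptions, not from the original answer:

from tensorflow import keras

# Keep the classifier this time instead of discarding it; the encoder
# sits inside it, before the dense head, and trains together with it.
bert_classifier, bert_encoder = bert.bert_models.classifier_model(
    bert_config, num_labels=2)
checkpoint = tf.train.Checkpoint(model=bert_encoder)
checkpoint.restore(
    os.path.join(gs_folder_bert, 'bert_model.ckpt')).assert_consumed()

bert_classifier.compile(
    optimizer=keras.optimizers.Adam(learning_rate=2e-5),  # typical BERT fine-tuning rate
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# train_x is the dict of token ids / masks / type ids, not encoder outputs.
bert_classifier.fit(train_x, train_y, batch_size=32, epochs=3)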

Alternatively, you can use Trainer from transformers.
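
A rough sketch of that alternative, assuming the Hugging Face yelp_polarity dataset and bert-base-uncased; the output directory, max length, batch size, epochs, and learning rate are placeholder choices:

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("yelp_polarity")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Crop to max_length and pad, as the answer above describes.
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(output_dir="yelp_bert", learning_rate=2e-5,
                         per_device_train_batch_size=32, num_train_epochs=3)
Trainer(model=model, args=args, train_dataset=dataset["train"],
        eval_dataset=dataset["test"]).train()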
