Huggingface error while training model with custom data

I am using the following notebook to train distilbert: https://github.com/krishnaik06/Huggingfacetransformer/blob/main/Custom_Sentiment_Analysis.ipynb

I'm using transformers==4.13.0 for the task.

When I run this code on Colab:

from transformers import TFDistilBertForSequenceClassification, TFTrainer

with training_args.strategy.scope():
    model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")

trainer = TFTrainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=test_dataset             # evaluation dataset
)

trainer.train()
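
(For context, training_args is a TFTrainingArguments instance defined earlier in the notebook. A minimal sketch with assumed values, since it is not shown here:)

from transformers import TFTrainingArguments

training_args = TFTrainingArguments(
    output_dir='./results',           # where checkpoints are written
    num_train_epochs=2,               # assumed value
    per_device_train_batch_size=8,    # consistent with the (8, 238) batches in the traceback below
    per_device_eval_batch_size=8,     # assumed value
)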

I get the following error:

Some layers from the model checkpoint at distilbert-base-uncased were not used when initializing TFDistilBertForSequenceClassification: ['vocab_transform', 'vocab_layer_norm', 'vocab_projector', 'activation_13']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier', 'classifier', 'dropout_19']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py:114: FutureWarning: The class `TFTrainer` is deprecated and will be removed in version 5 of Transformers. We recommend using native Keras instead, by calling methods like `fit()` and `predict()` directly on the model object. Detailed examples of the Keras style can be found in our examples at https://github.com/huggingface/transformers/tree/master/examples/tensorflow
  FutureWarning,
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-78414b52dd9d> in <module>()
      9 )
     10 
---> 11 trainer.train()

2 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
   1145           except Exception as e:  # pylint:disable=broad-except
   1146             if hasattr(e, "ag_error_metadata"):
-> 1147               raise e.ag_error_metadata.to_exception(e)
   1148             else:
   1149               raise

TypeError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py", line 704, in distributed_training_steps  *
        self.args.strategy.run(self.apply_gradients, inputs)
    File "/usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py", line 646, in apply_gradients  *
        gradients = self.training_step(features, labels, nb_instances_in_global_batch)
    File "/usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py", line 629, in training_step  *
        per_example_loss, _ = self.run_model(features, labels, True)
    File "/usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py", line 751, in run_model  *
        outputs = self.model(features, labels=labels, training=training)[:2]
    File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
        raise e.with_traceback(filtered_tb) from None

    TypeError: Exception encountered when calling layer "tf_distil_bert_for_sequence_classification" (type TFDistilBertForSequenceClassification).
    
    in user code:
    
        File "/usr/local/lib/python3.7/dist-packages/transformers/models/distilbert/modeling_tf_distilbert.py", line 813, in call  *
            loss = None if inputs["labels"] is None else self.compute_loss(inputs["labels"], logits)
        File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 919, in compute_loss  **
            y, y_pred, sample_weight, regularization_losses=self.losses)
    
        TypeError: 'NoneType' object is not callable
    
    
    Call arguments received:
      • input_ids={'input_ids': 'tf.Tensor(shape=(8, 238), dtype=int32)', 'attention_mask': 'tf.Tensor(shape=(8, 238), dtype=int32)'}
      • attention_mask=None
      • head_mask=None
      • inputs_embeds=None
      • output_attentions=None
      • output_hidden_states=None
      • return_dict=None
      • labels=tf.Tensor(shape=(8,), dtype=int32)
      • training=True
      • kwargs=<class 'inspect._empty'>

I'm using the following dataset: https://github.com/krishnaik06/Huggingfacetransformer/blob/main/SMSSpamCollection

Please guide me as to what will best fix this situation, thanks!

This code worked for me for TensorFlow version 2.8.2. As soon as I use a higher version (4 and above), I get the same error as above.
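
If you need to stay on this exact notebook, a possible workaround (a sketch based on the version reported above, not verified in this thread) is to pin TensorFlow in Colab and restart the runtime:

# Colab cell: pin the TensorFlow version reported to work above,
# then restart the runtime before re-running the notebook.
!pip install tensorflow==2.8.2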

I was able to replicate the issue in Colab. As the FutureWarning above notes, TFTrainer is deprecated, and the traceback suggests its loss computation collides with Keras's own Model.compute_loss, which calls the model's compiled_loss attribute; that is None because the model was never compiled, hence "'NoneType' object is not callable". Alternatively, you can train the model natively with Keras using the code below.

import tensorflow as tf
from transformers import TFDistilBertForSequenceClassification

model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased',
                                                              num_labels=2)

# The model outputs raw logits, so the loss must be built with from_logits=True.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5, epsilon=1e-08)

# Compile the model
model.compile(optimizer=optimizer, loss=loss_fn, metrics=['accuracy'])

# Train the model. batch_size is omitted: the datasets are already batched,
# and Keras rejects an explicit batch_size with tf.data inputs.
model.fit(train_dataset.shuffle(100).batch(16),
          epochs=1,
          validation_data=test_dataset.shuffle(100).batch(16))
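
Once training finishes, a quick sanity check could look like the sketch below. This is illustrative only: the tokenizer checkpoint matches the model above, and the 0 = ham / 1 = spam label mapping is an assumption about how the dataset was encoded.

import tensorflow as tf
from transformers import DistilBertTokenizerFast

# Tokenizer for the same checkpoint the model was fine-tuned from.
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

sample = ["Congratulations! You have won a free prize. Reply WIN to claim."]
inputs = tokenizer(sample, truncation=True, padding=True, return_tensors='tf')

# The fine-tuned model returns raw logits of shape (batch_size, num_labels).
logits = model(inputs).logits
pred = int(tf.argmax(logits, axis=-1)[0])  # assumed mapping: 0 = ham, 1 = spam
print(pred)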

Please find the gist here. Thank you!
