Can't load TF transformer model with keras.models.load_model()

I have a model trained in SageMaker (custom training job) and saved by my training script with the Keras model.save() method, which produces a variables directory with the weights and index, plus a .pb file. The model is a TFBertForSequenceClassification from Hugging Face's transformers library, and according to their documentation this model subclasses a Keras model. However, when I try to load the model with keras.models.load_model(), I get the following error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py", line 187, in load_model
    return saved_model_load.load(filepath, compile, options)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 121, in load
    path, options=options, loader_cls=KerasObjectLoader)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
    ckpt_options)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 194, in __init__
    super(KerasObjectLoader, self).__init__(*args, **kwargs)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 130, in __init__
    self._load_all()
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 215, in _load_all
    self._layer_nodes = self._load_layers()
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 315, in _load_layers
    layers[node_id] = self._load_layer(proto.user_object, node_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 341, in _load_layer
    obj, setter = self._revive_from_config(proto.identifier, metadata, node_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 368, in _revive_from_config
    obj, self._proto.nodes[node_id], node_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 298, in _add_children_recreated_from_config
    obj_child, child_proto, child_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 298, in _add_children_recreated_from_config
    obj_child, child_proto, child_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 250, in _add_children_recreated_from_config
    metadata = json_utils.decode(proto.user_object.metadata)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/json_utils.py", line 60, in decode
    return json.loads(json_string, object_hook=_decode_helper)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I'm stumped. The transformers library's own save_pretrained() method saves layer info in a .json file, but I don't see why a model saved through Keras would know or care about this (and I don't think that's the issue anyway). Any help?

The cause is a mismatch between the TensorFlow version that saved the model and the TensorFlow version you are loading it with. My server had TensorFlow 2.6.2 and my PC had 2.4.1. After training the model on the server, loading it on my PC raised "json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)".

  • Check the TensorFlow versions on both the server and your PC, and make sure they are the same or close.
  • After I upgraded TensorFlow on my PC, the trained model loaded successfully.
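To make the version check concrete, here is a minimal sketch in plain Python. The helper names (major_minor, same_minor_release) are hypothetical, and treating "same major.minor release" as the meaning of "similar" is my assumption, not a rule documented by TensorFlow.

```python
def major_minor(version):
    """Return (major, minor) as ints from a version string such as '2.6.2'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def same_minor_release(saved_with, loading_with):
    """Assumed heuristic: both environments are on the same major.minor release."""
    return major_minor(saved_with) == major_minor(loading_with)


# The two versions from this answer: saved with 2.6.2, loaded with 2.4.1.
print(same_minor_release("2.6.2", "2.4.1"))  # False
print(same_minor_release("2.6.2", "2.6.0"))  # True
```

On each machine, the string to pass in is tf.__version__ (after import tensorflow as tf).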

Another option is to build your own classifier, with the transformer as the first layer and your classifier head (and an output layer) on top of it. Then use model.save() and tf.keras.models.load_model(model_path) as follows:

Important: note how the first layer is used. Thanks to Utpal Chakraborty, who contributed the solution in "Isues with saving and loading tensorflow model which uses hugging face transformer model as its first layer".

import tensorflow as tf
from tensorflow.keras import Model
from transformers import (
    AutoConfig,
    AutoTokenizer,
    TFAutoModel)
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Input

#use GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
print(gpus)
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)

# number of target classes; also used for the output layer below
num_labels = 4
config = AutoConfig.from_pretrained('bert-base-uncased', output_hidden_states=True, num_labels=num_labels)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

transformer_layer = TFAutoModel.from_pretrained('bert-base-uncased',config=config, from_pt=False)

# optional - freeze all transformer layers (use the public `trainable` attribute):
for layer in transformer_layer.layers:
    layer.trainable = False

input_word_ids = Input(shape=(512,), dtype=tf.int32, name="input_ids")
mask = Input(shape=(512,), dtype=tf.int32, name="attention_mask")

#note this critical call to inner model layer
embedding = transformer_layer.bert(input_word_ids, mask)[0]

#take only the CLS embedding
hidden = tf.keras.layers.Dense(768, activation='relu')(embedding[:,0,:])

out = Dense(num_labels, activation='softmax')(hidden)

#Build the model (it is compiled further below)
model = Model(inputs=[input_word_ids, mask], outputs=out)
model.summary()  # summary() prints by itself and returns None

optimizer = Adam(learning_rate=5e-05)
metric = tf.keras.metrics.CategoricalAccuracy('accuracy')
model.compile(optimizer=optimizer, loss=tf.keras.losses.CategoricalCrossentropy(), metrics=[metric])

#Then fit the model
#.....

#Now save
model_dir = './tmp/model'
model.save(model_dir)

#test it:
model = tf.keras.models.load_model(model_dir)
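The model above takes two integer inputs of length 512, named input_ids and attention_mask. As a rough sketch of what those inputs look like, the plain-Python helper below pads token id lists to that length. The name pad_batch is hypothetical, the example ids are only illustrative, and the pad id of 0 assumes the bert-base-uncased vocabulary.

```python
MAX_LEN = 512  # matches Input(shape=(512,)) in the model above
PAD_ID = 0     # assumed [PAD] id for the bert-base-uncased vocabulary


def pad_batch(token_id_lists, max_len=MAX_LEN, pad_id=PAD_ID):
    """Truncate or right-pad each sequence to max_len and build matching masks."""
    input_ids, attention_mask = [], []
    for ids in token_id_lists:
        # Naive truncation: a real tokenizer would keep the final [SEP] token.
        ids = list(ids)[:max_len]
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch([[101, 7592, 2088, 102], [101, 102]])
print(len(batch["input_ids"][0]), sum(batch["attention_mask"][0]))  # 512 4
```

In practice the tokenizer loaded above can produce the same layout directly, for example with padding='max_length', truncation=True and max_length=512. The lists here would need converting to int32 arrays before being passed to the loaded model.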
