
How does TensorFlow SavedModel handle additional dependencies

I am trying to export my TF model using the tf.saved_model.save() function. However, I am using additional external libraries during preprocessing. A working example using nltk looks like this:

import tensorflow as tf
import nltk
import tensorflow_datasets as tfds
from tensorflow.python.ops import string_ops

train_data, val_data, test_data = tfds.load(
    name="imdb_reviews", 
    split=('train[:60%]', 'train[60%:]', 'test'),
    as_supervised=True)


def remove_stops(text):
    nltk.download('stopwords')  # fetch the stopword corpus (plain Python call)
    stop_words = nltk.corpus.stopwords.words('english')
    for w in stop_words:
        text = string_ops.regex_replace(text, r'\b{}\b'.format(w), '')  # remove each stopword
    text = tf.strings.strip(text)  # remove leading/trailing whitespace
    return text

vectorize_layer = tf.keras.layers.experimental.preprocessing.TextVectorization( # this gives me a BOW representation of my data...
    max_tokens=1000,
    standardize=remove_stops,  # ...using my method to remove stopwords
    output_mode='count') 

train_feats = list(map(lambda x: x[0], train_data))  # PrefetchDatasets don't allow direct indexing
vectorize_layer.adapt(train_feats)  # BOW needs to be trained beforehand to get vector reps

# define some simple network
input_layer = tf.keras.Input(shape=(), name='input_text', dtype=tf.string)
model = tf.keras.Sequential()
model.add(input_layer)
model.add(vectorize_layer)
model.add(tf.keras.layers.Dense(units=64, activation='softmax')) 
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['binary_accuracy'])
model.fit(train_data.shuffle(10000).batch(512),
          epochs=2,
          validation_data=val_data.batch(512))

tf.saved_model.save(model, './save_here')

As far as I understand the docs, a SavedModel includes the trained parameters and computations, but NOT my code. Yet when I load the same model in another script like this (i.e. without importing my external library at all):

import tensorflow as tf
loaded_model = tf.saved_model.load('./save_here')
loaded_model(tf.constant(['test me pls']))

I get my output without any errors:

<tf.Tensor: shape=(1, 1), dtype=float32, numpy=array([[0.5086063]], dtype=float32)>

This is especially baffling to me as I am even using nltk to download a dataset of stopwords. How exactly do SavedModels deal with these external libraries at export time?

Actually, you are not using the NLTK library at inference time. nltk is only called while the model is built: nltk.download('stopwords') and the stopword lookup run in plain Python, and the loop in remove_stops executes during tracing, so each stopword ends up as a constant pattern inside a RegexReplace op in the saved graph. The SavedModel therefore carries the results of your nltk calls, not the calls themselves.
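
To see the mechanism in isolation, here is a minimal sketch that works without nltk at all (the stop_words list, the Stripper module and the ./strip_demo path are made up for illustration; the list stands in for whatever nltk.corpus.stopwords.words('english') returns at export time):

import tensorflow as tf

stop_words = ['the', 'a', 'an']  # stand-in for nltk's stopword list

class Stripper(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None], tf.string)])
    def strip_stops(self, text):
        # this Python loop runs once, while the function is traced;
        # each iteration adds a RegexReplace op whose pattern is a
        # constant baked into the graph
        for w in stop_words:
            text = tf.strings.regex_replace(text, r'\b{}\b'.format(w), '')
        return tf.strings.strip(text)

tf.saved_model.save(Stripper(), './strip_demo')

# in a fresh process, neither stop_words nor the library that produced it
# is needed: the patterns already live inside the serialized graph
reloaded = tf.saved_model.load('./strip_demo')
print(reloaded.strip_stops(tf.constant(['the quick fox'])))
# tf.Tensor([b'quick fox'], shape=(1,), dtype=string)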

At inference time your model only needs raw text tensors; tokenization and the baked-in stopword removal are handled by the vectorize_layer inside the graph. If you want any additional preprocessing that was not traced into the model, you have to import the relevant libraries and apply it yourself before calling the loaded model.
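
A minimal sketch of that case (the lower-casing is just a placeholder for whatever nltk-based cleanup you have in mind):

import tensorflow as tf

loaded_model = tf.saved_model.load('./save_here')

raw_texts = ['Test ME pls']
cleaned = [s.lower() for s in raw_texts]  # placeholder: any external library could run here
print(loaded_model(tf.constant(cleaned)))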
