
How can I test the model I created with Keras

I was working on a text classification problem with Keras, but when I tried to test the model I created, I couldn't use the TfidfVectorizer on the test data.

from keras.models import model_from_json

with open('model_architecture.json', 'r') as f:
    model = model_from_json(f.read())

model.load_weights('model_weights.h5')

After loading the model, I prepared a test list to use.

test_data=["sentence1","sentence2","sentence3"]

No problem so far

But..

tf=TfidfVectorizer(binary=True)
train=tf.fit_transform(test_data)
test=tf.transform(test_data)
print(model.predict_classes(test))

ValueError: Error when checking input: expected dense_1_input to have shape (11103,) but got array with shape (92,)

I get such an error

And I also tried

tf=TfidfVectorizer(binary=True)
test=tf.transform(test_data)

sklearn.exceptions.NotFittedError: TfidfVectorizer - Vocabulary wasn't fitted.

but I received this error instead; I learned that fit() has to be called before transform() can be used.

But I still can't test the model I'm training

You need to encode your test data using the exact same TfidfVectorizer object you fit and used to transform the original training data, way back when you originally trained the model. If you fit a different TfidfVectorizer to your test data, then the encoding (including the vocab length) will be completely different and it will not work. This difference in vocab length is the proximate cause of the error you're seeing.

However, even if you do get the dimensions to match purely by chance, it still won't work, because the model was trained with an encoding that maps "cat" to 42, or whatever, while you're testing it with an encoding that maps "cat" to 13 or something. You'd basically be feeding it scrambled nonsense. There really is no alternative but to go and get the original TfidfVectorizer, or at least to fit a TfidfVectorizer to the exact same documents with the exact same configuration. If that is not possible, then you'll simply have to train a new model, and this time remember to save the TfidfVectorizer as well.
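As a minimal sketch of that flow (the training corpus here is a made-up stand-in for whatever documents the model was actually trained on): fit the vectorizer once on the training data, then only call transform() at test time, so both matrices share one vocabulary.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical training corpus; replace with the real training documents.
train_texts = ["the cat sat", "the dog barked", "a cat and a dog"]
test_texts = ["the cat barked"]

tf = TfidfVectorizer(binary=True)
X_train = tf.fit_transform(train_texts)  # fit ONCE, on the training data
X_test = tf.transform(test_texts)        # reuse the same fitted vectorizer

# Both matrices have the same number of columns, so the model's
# expected input shape is satisfied.
assert X_train.shape[1] == X_test.shape[1]
```

Calling fit_transform() again on the test data, as in the question, rebuilds the vocabulary from the test sentences alone, which is why the column count dropped from 11103 to 92.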

Normally the fitted preprocessor is saved to a pickle file via pickle.dump() during the initial training, and loaded with pickle.load() for testing and production, similar to what you did for model_architecture.json and model_weights.h5. It is also convenient to put everything together into an sklearn Pipeline so you only have to pickle one object, but I'm not sure how that works together with the Keras model.
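A minimal sketch of that save/load round trip (the file name and corpus below are just examples, not from your project):

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer

# At training time: fit the vectorizer, then save it next to the model files.
tf = TfidfVectorizer(binary=True)
tf.fit(["example training sentence", "another training sentence"])
with open('tfidf_vectorizer.pkl', 'wb') as f:
    pickle.dump(tf, f)

# At test/production time: load the SAME fitted vectorizer back.
with open('tfidf_vectorizer.pkl', 'rb') as f:
    tf_loaded = pickle.load(f)

# transform() now encodes new text with the original training vocabulary.
test = tf_loaded.transform(["a new sentence to classify"])
```

With this in place, model.predict on tf_loaded.transform(...) gets input with exactly the number of features the network was trained on.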
