I have trained an LSTM model on some data I collected. I want it to categorise text as either Canine or Feline.
I am attempting to predict the class of a string of text like so:
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("lstm.hd5")
print("Loaded model from disk")
text_to_predict = ['A 2‐year‐old male domestic shorthair cat was presented for a progressive history of abnormal posture, behavior, and mentation. Menace response was absent bilaterally, and generalized tremors were identified on neurological examination. A neuroanatomical diagnosis of diffuse brain dysfunction was made. A neurodegenerative disorder was suspected. Magnetic resonance imaging findings further supported the clinical suspicion. Whole‐genome sequencing of the affected cat with filtering of variants against a database of unaffected cats was performed. Candidate variants were confirmed by Sanger sequencing followed by genotyping of a control population. Two homozygous private (unique to individual or families and therefore absent from the breed‐matched controlled population) protein‐changing variants in the major facilitator superfamily domain 8 (MFSD8) gene, a known candidate gene for neuronal ceroid lipofuscinosis type 7 (CLN7), were identified. The affected cat was homozygous for the alternative allele at both variants. This is the first report of a pathogenic alteration of the MFSD8 gene in a cat strongly suspected to have CLN7.']
MAX_SEQUENCE_LENGTH = 352
MAX_NB_WORDS = 2000
tokenizer = Tokenizer(num_words=MAX_NB_WORDS, split=' ')
seq = tokenizer.texts_to_sequences(text_to_predict)
padded = pad_sequences(seq, maxlen=MAX_SEQUENCE_LENGTH)
pred = loaded_model.predict(padded)
labels = ['canine', 'feline']
print(pred, labels[np.argmax(pred)])
However, the predictions all come back the same, irrespective of the string I choose to classify.
[[0.5212073 0.47879276]] canine
I am also unsure as to why I have to set the MAX_SEQUENCE_LENGTH to 352, as it seems my model is expecting an array of that size. Setting it to any other value returns an error of
ValueError: Error when checking input: expected embedding_1_input to have shape (352,) but got array with shape (250,)
My Model training, for reference, is done through this code.
data = pd.read_csv('data.csv')
data['Text'] = data['Text'].apply((lambda x: re.sub('[^a-zA-z0-9\s]','',x)))
MAX_NB_WORDS = 2000
embed_dim = 128
lstm_out = 196
tokenizer = Tokenizer(num_words=MAX_NB_WORDS, split=' ')
tokenizer.fit_on_texts(data['Text'].values)
X = tokenizer.texts_to_sequences(data['Text'].values)
X = pad_sequences(X)
model = Sequential()
model.add(Embedding(max_fatures, embed_dim,input_length = X.shape[1]))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2,activation='softmax'))
model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])
print(model.summary())
# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
print('model string has been saved')
Y = data[['canine','feline']]
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)
batch_size = 32
model.fit(X_train, Y_train, epochs = 30, batch_size=batch_size, verbose = 2)
#save model for future use.
model.save('lstm.hd5')
Any help would be greatly appreciated :D
From your question, I understand that the model predicts correctly right after training, but predicts the same class for every input after loading the saved model.
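I recently faced the same issue. The solution is to save the Tokenizer with which the model was trained in a pickle file, and to load that pickle file when you want to perform predictions after loading the saved model. The prediction script in the question builds a new Tokenizer and never fits it, so it cannot reproduce the word indices the model was trained on.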
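To see what an unfitted tokenizer does to the input, here is a small stand-in. It is not the Keras class: `TinyTokenizer` is a hypothetical, simplified imitation that mirrors only the relevant behaviour as I understand it, namely that words missing from the fitted vocabulary are dropped (assuming no `oov_token` is set). Under that assumption, an unfitted tokenizer turns every text into an empty sequence, padding fills it with zeros, and the model sees the same input every time.

```python
from collections import Counter


class TinyTokenizer:
    """Hypothetical, simplified stand-in for a word-index tokenizer."""

    def __init__(self):
        self.word_index = {}  # stays empty until fit_on_texts is called

    def fit_on_texts(self, texts):
        counts = Counter(w for t in texts for w in t.lower().split())
        # The most frequent word gets index 1; index 0 is left for padding.
        for i, (word, _) in enumerate(counts.most_common(), start=1):
            self.word_index[word] = i

    def texts_to_sequences(self, texts):
        # Words that are not in the vocabulary are skipped entirely.
        return [
            [self.word_index[w] for w in t.lower().split() if w in self.word_index]
            for t in texts
        ]


train_texts = ["the cat sat", "the dog ran"]

fitted = TinyTokenizer()
fitted.fit_on_texts(train_texts)

unfitted = TinyTokenizer()  # analogous to the tokenizer in the prediction script

seq_fitted = fitted.texts_to_sequences(["the cat ran"])
seq_unfitted_a = unfitted.texts_to_sequences(["the cat ran"])
seq_unfitted_b = unfitted.texts_to_sequences(["a completely different text"])

print(seq_fitted)      # non-empty: the words are known
print(seq_unfitted_a)  # empty sequence
print(seq_unfitted_b)  # empty sequence again, whatever the text
```

Saving the fitted tokenizer and reloading it for prediction avoids this, because the loaded object still carries the vocabulary built during training.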
Code for saving the Tokenizer in a pickle file:
import pickle
# saving
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)
Code for Loading the Pickle File:
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer2 = pickle.load(handle)
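In addition to the above, some other observations about your code:
pad_sequences(X) without maxlen pads every sequence to the length of the longest one in the training data (352 here), and input_length = X.shape[1] then fixes that length in the model. That is why any other value raises the shape error. To make the length explicit, you can change the code from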
X = pad_sequences(X)
to
X = pad_sequences(X, maxlen=MAX_SEQUENCE_LENGTH)
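The effect of `maxlen` on the input width can be sketched in pure Python. The helper `pad_pre` below is hypothetical; it only imitates the default `pad_sequences` behaviour as I understand it (padding and truncating at the front). Without `maxlen`, the width is decided by the longest sequence in the data passed in; with `maxlen`, it is decided by you.

```python
def pad_pre(sequences, maxlen=None, value=0):
    """Hypothetical helper: left-pad or left-truncate each sequence to maxlen."""
    if maxlen is None:
        # No maxlen given: fall back to the longest sequence in this batch.
        maxlen = max(len(s) for s in sequences)
    padded = []
    for s in sequences:
        # Keep only the last maxlen items (truncate from the front).
        s = list(s)[-maxlen:] if maxlen > 0 else []
        # Fill the remaining width with the padding value on the left.
        padded.append([value] * (maxlen - len(s)) + s)
    return padded


train = [[4, 9], [7, 1, 3, 8, 2]]

implicit = pad_pre(train)            # width comes from the data
explicit = pad_pre(train, maxlen=4)  # width comes from the argument

print(len(implicit[0]), implicit)
print(len(explicit[0]), explicit)
```

A single text at prediction time is almost never as long as the longest training text, so the prediction side needs the explicit value in any case.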
The values of MAX_SEQUENCE_LENGTH and MAX_NB_WORDS should be the same before and after loading the model.
It is recommended to perform the same data preprocessing steps before and after loading the model. So, you can apply the function (lambda x: re.sub('[^a-zA-z0-9\s]','',x)) after loading the model as well.
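One way to keep the preprocessing identical is to put it in a single function that both the training script and the prediction script use. The sketch below assumes a hypothetical helper named `clean_text`. It uses the range `A-Za-z` rather than the `A-z` from the question, because `A-z` also matches the characters `[`, `\`, `]`, `^`, `_` and the backtick, which is probably not intended. Whichever pattern you pick, the point is that both scripts share it.

```python
import re

# Hypothetical shared helper; both scripts would import it from one module.
_NON_ALNUM = re.compile(r"[^A-Za-z0-9\s]")


def clean_text(text):
    """Remove every character that is not an ASCII letter, digit or whitespace."""
    return _NON_ALNUM.sub("", text)


# Training side (assuming a pandas DataFrame named data, as in the question):
#   data['Text'] = data['Text'].apply(clean_text)

# Prediction side: the input is a plain Python list, so use a comprehension.
text_to_predict = ["A 2-year-old cat [MFSD8], suspected CLN7!"]
cleaned = [clean_text(t) for t in text_to_predict]
print(cleaned)
```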
The training code, with these changes applied, is shown below:
data = pd.read_csv('data.csv')
data['Text'] = data['Text'].apply((lambda x: re.sub('[^a-zA-z0-9\s]','',x)))
MAX_NB_WORDS = 2000
MAX_SEQUENCE_LENGTH = 352 # must match the value used at prediction time
embed_dim = 128
lstm_out = 196
tokenizer = Tokenizer(num_words=MAX_NB_WORDS, split=' ')
tokenizer.fit_on_texts(data['Text'].values)
import pickle # IMPORTANT STEP
# saving
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)
X = tokenizer.texts_to_sequences(data['Text'].values)
X = pad_sequences(X, maxlen = MAX_SEQUENCE_LENGTH) # Change Number 2
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, embed_dim,input_length = X.shape[1])) # max_fatures was undefined; the vocabulary size is MAX_NB_WORDS
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2,activation='softmax'))
model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])
print(model.summary())
# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
print('model string has been saved')
Y = data[['canine','feline']]
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)
batch_size = 32
model.fit(X_train, Y_train, epochs = 30, batch_size=batch_size, verbose = 2)
#save model for future use.
model.save('lstm.hd5')
The modified code for loading the model and predicting is shown below:
import pickle # needed to load the saved Tokenizer
import re # needed for the text clean-up below
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("lstm.hd5")
print("Loaded model from disk")
text_to_predict = ['A 2‐year‐old male domestic shorthair cat was presented for a progressive history of abnormal posture, behavior, and mentation. Menace response was absent bilaterally, and generalized tremors were identified on neurological examination. A neuroanatomical diagnosis of diffuse brain dysfunction was made. A neurodegenerative disorder was suspected. Magnetic resonance imaging findings further supported the clinical suspicion. Whole‐genome sequencing of the affected cat with filtering of variants against a database of unaffected cats was performed. Candidate variants were confirmed by Sanger sequencing followed by genotyping of a control population. Two homozygous private (unique to individual or families and therefore absent from the breed‐matched controlled population) protein‐changing variants in the major facilitator superfamily domain 8 (MFSD8) gene, a known candidate gene for neuronal ceroid lipofuscinosis type 7 (CLN7), were identified. The affected cat was homozygous for the alternative allele at both variants. This is the first report of a pathogenic alteration of the MFSD8 gene in a cat strongly suspected to have CLN7.']
text_to_predict = [re.sub('[^a-zA-z0-9\s]','',x) for x in text_to_predict] # CHANGE 3 (a list has no .apply, so use a comprehension)
MAX_SEQUENCE_LENGTH = 352
MAX_NB_WORDS = 2000
# Loading the Pickle File ==> IMPORTANT STEP
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer2 = pickle.load(handle)
# tokenizer = Tokenizer(num_words=MAX_NB_WORDS, split=' ') # THIS IS NOT REQUIRED
seq = tokenizer2.texts_to_sequences(text_to_predict)
padded = pad_sequences(seq, maxlen=MAX_SEQUENCE_LENGTH)
pred = loaded_model.predict(padded)
labels = ['canine', 'feline']
print(pred, labels[np.argmax(pred)])
Please reach out if these changes don't give you the desired output, and I will be happy to help.
Hope this helps. Happy Learning!