I am able to save a DNN model in h5 format on S3, but when I load it in the inference pipeline of Kedro, I get blank output (no predictions). I made the following changes in the catalog.yml file:
    model:
      filepath: s3://ds-kedro/cuisine-classification-model/06_models/model.h5
      layer: models
      type: kedro.extras.datasets.tensorflow.TensorFlowModelDataset
I made changes in nodes.py as below:
    # Imports inferred from the names used below (assumed; not shown in the original post)
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.layers import Dense, Embedding, GlobalMaxPool1D
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow_addons.metrics import HammingLoss


    def train_model(multilabel_df: pd.DataFrame):
        """Use a tokenizer to convert text to sequences and a deep neural network (DNN) to predict cuisines.

        Args:
            multilabel_df: Contains restaurant names and cuisine codes
        Returns:
            dict with the trained model and its training history
        """
        # Convert restaurant names to padded integer sequences
        tokenizer = Tokenizer(num_words=5000, lower=True)
        tokenizer.fit_on_texts(multilabel_df['detailed_name'])
        sequences = tokenizer.texts_to_sequences(multilabel_df['detailed_name'])
        x = pad_sequences(sequences, maxlen=200)
        X_train, X_test, y_train, y_test = train_test_split(
            x,
            multilabel_df[multilabel_df.columns[1:]],
            test_size=0.1,
            random_state=42,
        )
        num_classes = y_train.shape[1]
        max_words = len(tokenizer.word_index) + 1
        maxlen = 200
        # Multilabel classifier: one sigmoid output per cuisine
        model = Sequential()
        model.add(Embedding(max_words, 20, input_length=maxlen))
        model.add(GlobalMaxPool1D())
        model.add(Dense(num_classes, activation='sigmoid'))
        model.compile(loss='binary_crossentropy', metrics=['acc'])
        history = model.fit(
            X_train, y_train,
            epochs=1,
            batch_size=32,
            validation_split=0.3,
        )
        metrics = model.evaluate(X_test, y_test)
        print("{}: {}".format(model.metrics_names[1], metrics[1]))
        print('Predicting....')
        y_pred = model.predict(X_test, verbose=1)
        metric = HammingLoss(mode='multilabel', threshold=0.5)
        metric.update_state(y_test, y_pred)
        print("Hamming Loss is:", metric.result().numpy())
        # model.save('model.h5')  # creates an HDF5 file 'model.h5'
        # return model
        return dict(
            model=model,
            model_history=history.history,
        )
I have tried different approaches, such as returning the model from the node and passing it as a parameter to the inference pipeline:
    def inference_pipeline(model, inference_data):
        ...  # pipeline code
It would be a great help if somebody could figure out what's wrong here, because I am not getting an error but I am also not getting any predictions (blank values).
Hello @Rajesh, this is where you should be saving your outputs via a pickle.PickleDataSet. The dataset supports several backends: it defaults to pickle, but you can pass it other backends such as joblib or dill if helpful.
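As a rough illustration of what a pickle-style backend does for a non-model output such as `model_history` from the question, here is a minimal sketch that uses only the standard-library `pickle` module. The history values and the helper names `save_pickle` and `load_pickle` are made up for this example, and this is not Kedro's implementation; it only shows the dump/load round trip that any module with a pickle-like interface would provide.

```python
import io
import pickle

# Hypothetical stand-in for `history.history` returned by the training node.
model_history = {"loss": [0.69, 0.52], "acc": [0.61, 0.74]}


def save_pickle(obj, fileobj, backend=pickle):
    """Serialize obj with any module that exposes a pickle-style dump()."""
    backend.dump(obj, fileobj)


def load_pickle(fileobj, backend=pickle):
    """Deserialize an object with the matching pickle-style load()."""
    return backend.load(fileobj)


# Round trip through an in-memory buffer instead of a real file or S3 path.
buffer = io.BytesIO()
save_pickle(model_history, buffer)
buffer.seek(0)
restored = load_pickle(buffer)
print(restored == model_history)  # → True
```

The `backend` argument is only there to mirror the idea of swappable backends; the same object must be loaded with the backend that saved it.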
You can always save Keras models in Kedro using the .h5 format. You need to install support for the tensorflow.TensorFlowModelDataset dataset as an extra using

    pip install kedro[<specify extra dataset>]

Then add a specification to the catalog.yml file:

    your_model:
      type: tensorflow.TensorFlowModelDataset
      filepath: <path to save in local/s3>/your_model.h5

You can then use your_model directly in the inference pipeline to predict.
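To make the last step concrete, here is a minimal sketch of an inference function that receives the loaded model and turns its sigmoid outputs into 0/1 labels. The function name `predict_labels`, the `StubModel` class and the input values are hypothetical; the stub stands in for the Keras model so the example runs without TensorFlow. The 0.5 threshold is borrowed from the `HammingLoss` call in the question. In a Kedro pipeline, a function like this would typically be registered as a node with `your_model` among its inputs.

```python
from typing import Any, List, Sequence


def predict_labels(
    model: Any,
    inputs: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Call model.predict and binarize each per-class probability."""
    probabilities = model.predict(inputs)
    return [[int(p >= threshold) for p in row] for row in probabilities]


class StubModel:
    """Hypothetical stand-in for the loaded Keras model."""

    def predict(self, inputs):
        # Return the same fixed per-class probabilities for every input row.
        return [[0.9, 0.2, 0.6] for _ in inputs]


predictions = predict_labels(StubModel(), inputs=[[1, 2, 3], [4, 5, 6]])
print(predictions)  # → [[1, 0, 1], [1, 0, 1]]
```

With a real model, the inputs would need the same preprocessing as in training (the same fitted tokenizer and `maxlen=200` padding). If every probability falls below the threshold, every label comes out as 0, so the raw probabilities are worth inspecting when predictions look blank.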