简体   繁体   中英

How to save keras model in kedro

I am able to save DNN Model in h5 format on s3. but when I import it in inference pipeline of kedro tool, I am getting blank?no predictions. I made following changes in catalog.yml file:

model:
  filepath: s3://ds-kedro/cuisine-classification-model/06_models/model.h5
  layer: models
  type: kedro.extras.datasets.tensorflow.TensorFlowModelDataset

I made changes in nodes.py as below:

    def train_model(multilabel_df: pd.DataFrame):
    """Use tokenizer to convert text to sequence and Use Deep Neural Network (DNN) to predict cuisines.
    Args: 
        feature_table: Contains restaurant names and cuisine code
    Returns:
        Model
    """
    tokenizer = Tokenizer(num_words=5000, lower=True)
    tokenizer.fit_on_texts(multilabel_df['detailed_name'])
    sequences = tokenizer.texts_to_sequences(multilabel_df['detailed_name'])
    x = pad_sequences(sequences, maxlen=200)
    X_train, X_test, y_train, y_test = train_test_split(x, 
                                                    
                         multilabel_df[multilabel_df.columns[1:]], 
                                                    test_size=0.1, 
                                                    random_state=42)
    num_classes = y_train.shape[1]
    max_words = len(tokenizer.word_index) + 1
    maxlen = 200
    model = Sequential()
    model.add(Embedding(max_words, 20, input_length=maxlen))
    model.add(GlobalMaxPool1D())
    model.add(Dense(num_classes, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', metrics=['acc'])
    history = model.fit(X_train, y_train,
                    epochs=1,
                    batch_size=32,
                    validation_split=0.3,
                    )
    metrics = model.evaluate(X_test, y_test)
    print("{}: {}".format(model.metrics_names[1], metrics[1]))
    print('Predicting....')
    y_pred = model.predict(X_test,verbose=1)
    metric = HammingLoss(mode='multilabel', threshold=0.5)
    metric.update_state(y_test, y_pred)
    print("Hamming Loss is:",metric.result().numpy())
    #model.save('model.h5')  # creates a HDF5 file 'my_model.h5'
    #return model
    return dict(
        model=model,
        model_history=history.history,
    )

I have tried different methods like I put model in return statement and pass this parameter in inference pipeline.

def inference_pipeline(model, inference_data):
    pipeline code

It would be great help if somebody try to figure out whats wrong here becuase I am not getting error but also not getting any predictions(Blank values)

Hello @Rajesh this is where you should be saving your outputs via a pickle.PickleDataSet

The dataset supports several backends, it defaults to cpickle - but you can pass it additional backends like joblib or dill if helpful.

You can always save Keras models in Kedro using .hd5 format. You need to install the tensorflow.TensorFlowModelDataset dataset as an extra dataset support using

pip install kedro[<specify extra dataset>]

then

Add a specification in a catalog.yml file as:

your_model:
  type: tensorflow.TensorFlowModelDataset
  filepath: <path to save in local/s3>/your_model.hd5

You can use your_model in inference pipeline directly to predict.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM