
Using Keras serialized model with dropout in pyspark

I have several neural networks built using Keras that I have so far used mostly in Jupyter. I often save scikit-learn models with joblib and Keras models with json + hdf5, and use them in other notebooks without issue.

I made a Python Spark application that can make use of those serialized models in cluster mode. The joblib models work fine; however, I encountered an issue with Keras.

Here is the model used in both the notebook and pyspark:

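from keras.models import Sequential
from keras.layers import Embedding, GRU, Dense

# max_nb_words and max_sequence_length are defined elsewhere in the application.
def build_gru_model():
    model = Sequential()
    model.add(Embedding(max_nb_words, 128, input_length=max_sequence_length, dropout=0.2))
    model.add(GRU(128, dropout_W=0.2, dropout_U=0.2))
    model.add(Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model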

Both are called the same way:

preds = model.predict_proba(data, verbose=0)

However, only in Spark do I get this error:

MissingInputError: ("An input of the graph, used to compute DimShuffle{x,x,x,x}(keras_learning_phase), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.", keras_learning_phase)

I've done the mandatory search and found https://github.com/fchollet/keras/issues/2430, which points to https://keras.io/getting-started/faq/

If I remove dropout from my model, it works. However, I fail to understand how to implement something that would let me keep dropout during the training phase, as described in the FAQ.

Based on the model code, how would one accomplish this?
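For reference, the FAQ entry linked above handles the learning phase by passing it as an explicit input to a backend function. Below is a minimal sketch of that pattern, assuming the Keras 1.x backend API (`K.function`, `K.learning_phase`) and a Sequential model like the one shown. `make_test_phase_predictor` is a hypothetical helper name, not part of Keras.

```python
def make_test_phase_predictor(model):
    """Compile a backend function that takes the learning phase as an explicit input."""
    # Imported lazily so the import runs in the process that calls this
    # function (for example a Spark executor).
    import keras.backend as K

    # keras_learning_phase is the graph input named in the error message;
    # listing it as an input lets the caller supply its value.
    predict_fn = K.function(
        [model.layers[0].input, K.learning_phase()],
        [model.layers[-1].output],
    )

    def predict(data):
        # 0 = test mode (dropout off); 1 would select training mode.
        return predict_fn([data, 0])[0]

    return predict
```

With a loaded model, this would be used as `predict = make_test_phase_predictor(model)` followed by `preds = predict(data)`. The model keeps its dropout layers for training, and only the prediction call runs in test mode.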

You can try putting this before your prediction:

import keras.backend as K
K.set_learning_phase(0)

It should set your learning phase to 0 (test time), in which dropout is not applied.
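Here is a rough sketch of how this might fit into the Spark job, assuming the model was saved as JSON + HDF5 as described in the question and that both files are readable on the worker. The file names and both function names are hypothetical. The phase is set before the model is rebuilt, on the assumption that the layers need to be constructed in test mode.

```python
def load_inference_model(json_path, weights_path):
    """Rebuild a Keras model from JSON + HDF5 with the learning phase fixed to 0."""
    # Imported inside the function so the import happens in whichever
    # process calls it (for example a Spark executor), not at module load.
    import keras.backend as K
    from keras.models import model_from_json

    # 0 = test time: dropout is disabled. Set before the model is rebuilt.
    K.set_learning_phase(0)

    with open(json_path) as f:
        model = model_from_json(f.read())
    model.load_weights(weights_path)
    return model


def predict_partition(batches, json_path="model.json", weights_path="weights.h5"):
    """Hypothetical mapPartitions helper: load the model once per partition.

    Assumes each element of the partition is already a padded input batch.
    """
    model = load_inference_model(json_path, weights_path)
    for batch in batches:
        yield model.predict_proba(batch, verbose=0)
```

It could then be applied with something like `rdd.mapPartitions(predict_partition)`, so that each partition loads the model once rather than once per record.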

