How to log a sklearn pipeline with a Keras step using mlflow.pyfunc.log_model()? TypeError: can't pickle _thread.RLock objects
I want to log a sklearn pipeline that contains a Keras step to MLflow.
The pipeline has 2 steps: a sklearn StandardScaler and a Keras TensorFlow model.
I am using mlflow.pyfunc.log_model() as a possible solution, but I get this error:
TypeError: can't pickle _thread.RLock objects
---> mlflow.pyfunc.log_model("test1", python_model=wrappedModel, signature=signature)
Here is my code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import keras
from keras import layers, Input
from keras.wrappers.scikit_learn import KerasRegressor
import mlflow.pyfunc
from sklearn.pipeline import Pipeline
from mlflow.models.signature import infer_signature
#toy dataframe
df1 = pd.DataFrame([[1,2,3,4,5,6], [10,20,30,40,50,60],[100,200,300,400,500,600]] )
#create train test datasets
X_train, X_test = train_test_split(df1, random_state=42, shuffle=True)
#scale X_train
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_train_s = pd.DataFrame(X_train_s)
#wrap the keras model to use it inside of sklearn pipeline
def create_model(optimizer='adam', loss='mean_squared_error', s = X_train.shape[1]):
    input_layer = keras.Input(shape=(s,))
    # "encoded" is the encoded representation of the input
    encoded = layers.Dense(25, activation='relu')(input_layer)
    encoded = layers.Dense(2, activation='relu')(encoded)
    # "decoded" is the lossy reconstruction of the input
    decoded = layers.Dense(2, activation='relu')(encoded)
    # chain from "decoded" so the previous layer is not silently discarded
    decoded = layers.Dense(25, activation='relu')(decoded)
    decoded = layers.Dense(s, activation='linear')(decoded)
    model = keras.Model(input_layer, decoded)
    model.compile(optimizer, loss)
    return model
# wrap the model
model = KerasRegressor(build_fn=create_model, verbose=1)
# create the pipeline
pipe = Pipeline(steps=[
    ('scale', StandardScaler()),
    ('model', model)
])
#function to wrap the pipeline to be logged by mlflow
class SklearnModelWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, model):
        self.model = model
    def predict(self, context, model_input):
        return self.model.predict(model_input)[:,1]
mlflow.end_run()
with mlflow.start_run(run_name='test1'):
    #train the pipeline
    pipe.fit(X_train, X_train_s, model__epochs=2)
    #wrap the model for mlflow log
    wrappedModel = SklearnModelWrapper(pipe)
    # Log the model with a signature that defines the schema of the model's inputs and outputs.
    signature = infer_signature(X_train, wrappedModel.predict(None, X_train))
    mlflow.pyfunc.log_model("test1", python_model=wrappedModel, signature=signature)
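From what I have found, this type of error seems to be related to thread concurrency. It could then be related to TensorFlow, since it distributes code during the model training phase.
However, the offending line of code comes after the training phase. If I remove that line, the rest of the code works, which makes me think the error occurs after the concurrent phase of model training. I have no idea why I get this error in this case. I am a beginner. Can someone help me? Thanks.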
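One way to check the reasoning above is to reproduce the message without any training or running threads. The sketch below uses only the standard library; `HoldsLock` and `try_pickle` are hypothetical names introduced for illustration and are not part of MLflow, Keras, or the code in the question. It suggests that the error comes from serializing an object that merely holds a lock, which would explain why it appears at the logging step rather than during training.

```python
import pickle
import threading


class HoldsLock:
    """Hypothetical stand-in for a model object that keeps a lock internally."""

    def __init__(self):
        self.weights = [1, 2, 3]
        # The lock is never acquired and no extra thread is started.
        self._lock = threading.RLock()


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


plain_result = try_pickle([1, 2, 3])   # a plain list pickles without error
error = try_pickle(HoldsLock())        # the lock attribute blocks pickling
print(error)
```

The exact wording of the message depends on the Python version, but it names `_thread.RLock` in each case.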
I think that instead of python_model=wrappedModel
it should be python_model=SklearnModelWrapper()
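Whether changing that argument resolves the error is not confirmed here. Independently of MLflow, a general Python pattern for making an object picklable even though it holds a lock is to leave the lock out of the pickled state and recreate it on load. The following is a minimal sketch of that pattern using only the standard library; `PicklableWrapper` and its `predict` method are hypothetical and only illustrate the idea. Applying it to a Keras model would additionally require saving and reloading the network itself, which is not shown.

```python
import pickle
import threading


class PicklableWrapper:
    """Hypothetical wrapper that excludes its lock from the pickled state."""

    def __init__(self, weights):
        self.weights = list(weights)
        self._lock = threading.RLock()

    def __getstate__(self):
        # Copy the attributes and drop the member that cannot be pickled.
        state = self.__dict__.copy()
        state.pop("_lock", None)
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes, then build a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.RLock()

    def predict(self, xs):
        # Toy computation: scale each input by the sum of the weights.
        with self._lock:
            total = sum(self.weights)
            return [x * total for x in xs]


restored = pickle.loads(pickle.dumps(PicklableWrapper([1, 2, 3])))
predictions = restored.predict([1, 2])
print(predictions)
```

The restored object has its own new lock, so it can be used right after loading.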