How to save a custom transformer in sklearn?
I am not able to load an instance of a custom transformer saved using either sklearn.externals.joblib.dump or pickle.dump because the original definition of the custom transformer is missing from the current Python session.
Suppose that in one Python session I define, create, and save a custom transformer; it can also be loaded in the same session:
from sklearn.base import TransformerMixin
from sklearn.base import BaseEstimator
# On scikit-learn >= 0.23 this import no longer exists; use `import joblib` instead.
from sklearn.externals import joblib

class CustomTransformer(BaseEstimator, TransformerMixin):
    def __init__(self):
        pass

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return X

custom_transformer = CustomTransformer()
joblib.dump(custom_transformer, 'custom_transformer.pkl')
loaded_custom_transformer = joblib.load('custom_transformer.pkl')
Opening a new Python session and loading from 'custom_transformer.pkl'
from sklearn.externals import joblib
joblib.load('custom_transformer.pkl')
raises the following exception:
AttributeError: module '__main__' has no attribute 'CustomTransformer'
The same thing is observed if joblib is replaced with pickle. Saving the custom transformer in one session with
import pickle

with open('custom_transformer_pickle.pkl', 'wb') as f:
    pickle.dump(custom_transformer, f, -1)  # -1 selects the highest pickle protocol
and loading it in another:
import pickle

with open('custom_transformer_pickle.pkl', 'rb') as f:
    loaded_custom_transformer_pickle = pickle.load(f)
raises the same exception.
In the above, if CustomTransformer is replaced with, say, sklearn.preprocessing.StandardScaler, then the saved instance can be loaded in a new Python session.
Is it possible to save a custom transformer and load it later somewhere else?
sklearn.preprocessing.StandardScaler works because the class definition is available in the sklearn package installation, which joblib will look up when you load the pickle.
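To make that lookup concrete, here is a minimal sketch using only the standard library. fractions.Fraction is an assumed stand-in for any class that ships in an installed package (as StandardScaler does); the point is that the pickle stream records the module name and class name, not the class's code.

```python
import pickle
import pickletools
from fractions import Fraction  # stand-in for a class from an installed package

# Protocol 4 is requested explicitly so the stream layout does not depend
# on the interpreter's default protocol.
payload = pickle.dumps(Fraction(1, 3), protocol=4)

# Collect every string stored in the pickle stream. These include the module
# name and the class name that the unpickler will import and look up.
strings = [arg for _opcode, arg, _pos in pickletools.genops(payload)
           if isinstance(arg, str)]
print(strings)

# Loading succeeds because `fractions.Fraction` can be imported here.
restored = pickle.loads(payload)
print(restored)
```

A class defined interactively is recorded as living in `__main__`, which is why a fresh session that never defined CustomTransformer fails with the AttributeError shown in the question.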
You'll have to make your CustomTransformer class available in the new session, either by re-defining or importing it.
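The importing route can be sketched roughly as follows. This is an illustration under assumptions, not a definitive recipe: the module name my_transformers.py is hypothetical, the class is a plain stand-in (a real one would subclass BaseEstimator and TransformerMixin as in the question), and plain pickle is used in place of joblib. A subprocess plays the role of the new session.

```python
import os
import pickle
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

workdir = Path(tempfile.mkdtemp())

# 1. Put the class in its own module (hypothetical name: my_transformers.py).
(workdir / "my_transformers.py").write_text(textwrap.dedent('''
    class CustomTransformer:
        """Plain stand-in for the sklearn transformer in the question."""

        def fit(self, X, y=None):
            return self

        def transform(self, X, y=None):
            return X
'''))

# 2. "Session one": import the class from the module, then save an instance.
sys.path.insert(0, str(workdir))
from my_transformers import CustomTransformer

pickle_path = workdir / "custom_transformer.pkl"
with open(pickle_path, "wb") as f:
    pickle.dump(CustomTransformer(), f)

# 3. "Session two": a fresh interpreter loads the file. It never defines the
#    class itself; it only needs my_transformers on its import path.
loader = textwrap.dedent('''
    import pickle
    with open("custom_transformer.pkl", "rb") as f:
        obj = pickle.load(f)
    print(type(obj).__module__, type(obj).__name__)
''')
result = subprocess.run(
    [sys.executable, "-c", loader],
    cwd=workdir,
    env={**os.environ, "PYTHONPATH": str(workdir)},
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Because the instance was created from a class imported out of my_transformers, the pickle refers to my_transformers.CustomTransformer rather than `__main__`.CustomTransformer, so any session that can import that module can load the file.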
It works for me if I pass my transform function to sklearn.preprocessing.FunctionTransformer() and save and load the model as a ".pk" file using dill.dump() and dill.load().
Note: I have included the transform function in a sklearn pipeline with my classifier.