Suppose I have a very simple machine learning model, as follows:
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import RidgeCV
from sklearn.svm import LinearSVR
from sklearn.ensemble import RandomForestRegressor, StackingRegressor

X, y = load_diabetes(return_X_y=True)

estimators = [
    ('lr', RidgeCV()),
    ('svr', LinearSVR(random_state=42))
]
reg = StackingRegressor(
    estimators=estimators,
    final_estimator=RandomForestRegressor(n_estimators=10,
                                          random_state=42)
)
steps = [
    ("preprocessing", StandardScaler()),
    ("regression", reg)
]
pipe = Pipeline(steps)
What I would like to do is store the whole model's parameters in a JSON file. By that I mean that the following information is saved in the JSON file:
Pipeline(steps=[('preprocessing', StandardScaler()),
                ('regression',
                 StackingRegressor(estimators=[('lr',
                                                RidgeCV(alphas=array([ 0.1,  1. , 10. ]))),
                                               ('svr',
                                                LinearSVR(random_state=42))],
                                   final_estimator=RandomForestRegressor(n_estimators=10,
                                                                         random_state=42)))])
When I call json.dumps(pipe), I get the error "Object of type Pipeline is not JSON serializable". Any idea how one can do that?
The output you've given is just the string representation, so str(pipe) will produce that. It contains extra spacing and newlines for formatting, so it is perhaps not ideal for storage.
You could use pipe.get_params() to retrieve a dictionary of parameters. Unlike the string, that will include all the default parameters. To keep only the non-default ones, you could use the private function _changed_params from sklearn.utils._pprint, but because it is private you would need to watch for breaking changes. There might be a more public method, along the lines of the pretty-printing and display of estimators, that gives you what you need.
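As a rough sketch of the get_params() route: the dictionary it returns still contains estimator objects (and, depending on the scikit-learn version, possibly NumPy arrays), so json.dumps needs a fallback for those values. The helper name to_jsonable below is made up for illustration, and storing nested estimators as their repr() string is just one possible choice, not something scikit-learn prescribes.

```python
import json

import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR


def to_jsonable(obj):
    """Hypothetical fallback, called by json.dumps only for values it cannot encode."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()   # arrays become plain lists
    if isinstance(obj, np.generic):
        return obj.item()     # NumPy scalars become Python scalars
    return repr(obj)          # estimators etc. are stored as their string form


reg = StackingRegressor(
    estimators=[("lr", RidgeCV()), ("svr", LinearSVR(random_state=42))],
    final_estimator=RandomForestRegressor(n_estimators=10, random_state=42),
)
pipe = Pipeline([("preprocessing", StandardScaler()), ("regression", reg)])

# get_params() flattens nested parameters into "step__param" keys
params_json = json.dumps(pipe.get_params(), default=to_jsonable, indent=2)

# Round-trip through JSON to check that the result is valid
restored = json.loads(params_json)
```

The resulting JSON is a record of the configuration rather than something that can rebuild the pipeline directly: scalar values such as restored["regression__final_estimator__n_estimators"] could be passed back through set_params on a freshly constructed pipeline, but the entries that were stored as repr() strings could not.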