How do I create a transformer called Dictifier() that encapsulates converting a DataFrame using .to_dict("records")?
I want to be able to use it as a step inside a pipeline like this:
pipeline = Pipeline([
    ("featureunion", numeric_categorical_union),
    ("dictifier", Dictifier()),
    ("vectorizer", DictVectorizer(sort=False)),
    ("clf", xgb.XGBClassifier(max_depth=3))
])
Try this, which relies on sklearn's pipeline (comment if you are using another kind of pipeline object). For each custom step you want in your pipeline, build a class with two methods: fit and transform.
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.base import BaseEstimator, TransformerMixin

df = pd.DataFrame({'x': [1, 2, 3],
                   'y': [4, 5, 6]})

class dataframe2dict(BaseEstimator, TransformerMixin):
    def __init__(self):
        pass

    def fit(self, X, y=None):
        """Mock method: nothing to learn, but a pipeline calls fit(X, y)."""
        return self

    def transform(self, df: pd.DataFrame):
        # "records" gives one dict per row, the format DictVectorizer expects
        return df.to_dict("records")

pipeline = make_pipeline(dataframe2dict())
pipeline.fit_transform(df)  # [{'x': 1, 'y': 4}, {'x': 2, 'y': 5}, {'x': 3, 'y': 6}]
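To show how such a step could slot in ahead of DictVectorizer, as in the question, here is a minimal sketch. It makes several assumptions: the class is named Dictifier to match the question, the toy data and its column names are made up for illustration, the featureunion step is left out so that the input is a plain DataFrame, and LogisticRegression stands in for xgb.XGBClassifier only to avoid the extra dependency.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class Dictifier(BaseEstimator, TransformerMixin):
    """Turn a DataFrame into a list of per-row dicts for DictVectorizer."""

    def fit(self, X, y=None):
        # Stateless step: nothing to learn, just return self.
        return self

    def transform(self, X):
        # One dict per row, keyed by column name.
        return X.to_dict("records")


# Hypothetical toy data, not from the original question.
X = pd.DataFrame({"age": [25, 32, 47, 51],
                  "city": ["NY", "LA", "NY", "SF"]})
y = [0, 1, 0, 1]

pipeline = Pipeline([
    ("dictifier", Dictifier()),
    ("vectorizer", DictVectorizer(sort=False)),
    ("clf", LogisticRegression()),  # stand-in for xgb.XGBClassifier
])

pipeline.fit(X, y)
predictions = pipeline.predict(X)

# The intermediate output of the custom step on its own.
records = Dictifier().fit_transform(X)
```

If the step before Dictifier returns a NumPy array rather than a DataFrame (as a FeatureUnion typically does), X.to_dict will not exist, so the upstream step would need to hand over a DataFrame for this sketch to work.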