Function that converts DataFrame to Dictionary

Question

How do I create a transformer called Dictifier() that wraps the DataFrame conversion .to_dict("records"), so that I can use it as a step in a pipeline?

I want to be able to use it inside a pipeline like this:

pipeline = Pipeline([
                     ("featureunion", numeric_categorical_union),
                     ("dictifier", Dictifier()),
                     ("vectorizer", DictVectorizer(sort=False)),
                     ("clf", xgb.XGBClassifier(max_depth = 3))
                     ])

Answer 1

Try this, assuming you are using an sklearn pipeline (leave a comment if you are using another kind of pipeline object). Each custom step you add to the pipeline should be a class that implements two methods: fit and transform.

import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.base import BaseEstimator, TransformerMixin

df = pd.DataFrame({'x': [1, 2, 3],
                   'y': [4, 5, 6]})

class dataframe2dict(BaseEstimator, TransformerMixin):

    def __init__(self):
        pass

    def fit(self, X, y=None):
        """Nothing to learn; accepts X and y because the pipeline passes them."""
        return self

    def transform(self, df: pd.DataFrame):
        # One dict per row, as asked for in the question
        return df.to_dict("records")


pipeline = make_pipeline(dataframe2dict())

# fit_transform calls fit (a no-op here) and then transform
pipeline.fit_transform(df)
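
To connect this to the pipeline in the question, here is a minimal sketch of the same idea under the name Dictifier, followed by DictVectorizer(sort=False). The toy DataFrame and the names toy and demo_pipeline are made up for illustration, and the FeatureUnion and XGBClassifier steps are left out. Wrapping the input in pd.DataFrame is an assumption on my part: an earlier step such as a FeatureUnion may hand over an array rather than a DataFrame, in which case the dict keys become column positions instead of names.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import Pipeline


class Dictifier(BaseEstimator, TransformerMixin):
    """Turn tabular input into a list of per-row dicts."""

    def fit(self, X, y=None):
        # Stateless step: nothing to learn
        return self

    def transform(self, X):
        # Assumption: X is a DataFrame or a 2-D array; pd.DataFrame accepts both
        return pd.DataFrame(X).to_dict("records")


# Hypothetical toy data with one numeric and one categorical column
toy = pd.DataFrame({"x": [1, 2, 3], "color": ["red", "blue", "red"]})

demo_pipeline = Pipeline([
    ("dictifier", Dictifier()),
    ("vectorizer", DictVectorizer(sort=False)),
])

records = Dictifier().fit_transform(toy)   # list with one dict per row
encoded = demo_pipeline.fit_transform(toy)  # numeric x plus one-hot color
```

A classifier step such as the XGBClassifier from the question could then be appended after the vectorizer.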

solution1
0 ACCEPTED 2020-06-09 16:35:40