What method does the sklearn VotingClassifier fit use?

Question

The official documentation does not seem to provide this information.

I am wondering why we can't pass already-trained models to the VotingClassifier so that they do not need to be trained again; as it stands, VotingClassifier requires us to call its fit method before predicting.

Does it just do:

for clf in self.clfs:
    clf.fit(X, y)

or does it use some more interesting folding method?

Answer 1

Here's what VotingClassifier.fit does:

def fit(self, X, y, sample_weight=None):
    ...  # Validates the arguments, estimators, etc.

    # Encode the class labels as integers 0..n_classes-1.
    self.le_ = LabelEncoder()
    self.le_.fit(y)
    self.classes_ = self.le_.classes_
    self.estimators_ = []

    transformed_y = self.le_.transform(y)

    # Fit a fresh clone of every estimator, in parallel; the objects
    # passed in via self.estimators are left untouched.
    self.estimators_ = Parallel(n_jobs=self.n_jobs)(
        delayed(_parallel_fit_estimator)(clone(clf), X, transformed_y,
                                         sample_weight)
        for _, clf in self.estimators)

    return self

... where _parallel_fit_estimator is just a thin wrapper around the estimator.fit call:

def _parallel_fit_estimator(estimator, X, y, sample_weight):
    if sample_weight is not None:
        estimator.fit(X, y, sample_weight)
    else:
        estimator.fit(X, y)
    return estimator

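As you can see, the method does simply fit the classifiers (in parallel!), with no folding involved. Note that it fits clones of the estimators you passed in (clone(clf)), so any training already done on the original objects is not reused. It also creates the label encoder self.le_ and the self.estimators_ attribute. The predict() and transform() methods are built on top of these attributes, which is why calling fit() first is necessary.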
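The cloning behaviour can be checked directly. The snippet below is a minimal sketch, not part of the original answer: the toy dataset, the two estimators, and all variable names are assumptions chosen for illustration. It first confirms that the objects handed to VotingClassifier stay unfitted after fit(), then shows one possible workaround for already-trained models, namely averaging their predict_proba outputs by hand (an unweighted soft vote).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Toy data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

lr = LogisticRegression(max_iter=1000)
tree = DecisionTreeClassifier(random_state=0)

voter = VotingClassifier(estimators=[("lr", lr), ("tree", tree)], voting="soft")
voter.fit(X, y)

# fit() trained clones: the objects we passed in are still unfitted.
originals_unfitted = not hasattr(lr, "coef_") and not hasattr(tree, "tree_")
fitted_are_clones = all(
    fitted is not original
    for fitted, original in zip(voter.estimators_, (lr, tree))
)

# Possible workaround for pre-trained models: fit them yourself (here we
# fit the originals to stand in for "already trained" models), then
# average their class probabilities and take the most probable class.
prefit = [lr.fit(X, y), tree.fit(X, y)]
avg_proba = np.mean([m.predict_proba(X) for m in prefit], axis=0)
manual_pred = prefit[0].classes_[np.argmax(avg_proba, axis=1)]
```

The manual vote assumes every pre-trained model exposes predict_proba and was trained on the same set of classes, so that the probability columns line up.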

Answer 1: 0 ACCEPTED 2018-01-01 14:15:02