
Repeated FeatureUnion in scikit-learn

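I am learning about Pipelines and FeatureUnions in scikit-learn, and I am wondering whether it is possible to apply "make_union" repeatedly to a class.

Consider the following code: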

import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.linear_model import LogisticRegression
import sklearn.datasets as d

class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """
       Raise each feature in the matrix to the given power
    """
    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.power(X, self.power)

iris_data = d.load_iris()

X, y = iris_data.data, iris_data.target


# feature union:
fu = FeatureUnion(transformer_list=[('squared', IrisDataManupulation(power=2)),
                               ('third', IrisDataManupulation(power=3))])
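
For reference, here is a minimal, self-contained sketch of what this union produces on the iris data. It restates the transformer from the question (with the mixin listed first, which is the order scikit-learn's developer guide recommends) and only checks the shape of the result; the variable name `X_union` is introduced here for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.pipeline import FeatureUnion


class IrisDataManupulation(TransformerMixin, BaseEstimator):
    """Raise every feature to a fixed power (restated from the question)."""

    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        # Stateless transformer: nothing is learned from the data
        return self

    def transform(self, X):
        return np.power(X, self.power)


X, y = load_iris(return_X_y=True)

fu = FeatureUnion(transformer_list=[
    ("squared", IrisDataManupulation(power=2)),
    ("third", IrisDataManupulation(power=3)),
])

# Each transformer returns 4 columns; the union stacks them side by side
X_union = fu.fit_transform(X)
print(X.shape, X_union.shape)  # (150, 4) (150, 8)
```

The first four columns hold the squared features and the last four the cubed ones, in the order the transformers are listed.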

Question: is there a neat way to create the FeatureUnion without repeating the same transformer, by passing a list of parameters instead?

For example:

# Hypothetical syntax I would like to use; FeatureUnion has no param_grid argument
fu_new = FeatureUnion(transformer_list=[('raise_power', IrisDataManupulation())],
                      param_grid={'raise_power__power': [2, 3]})

You can move all of that functionality into a single custom transformer. We can change your IrisDataManupulation to handle a list of powers:

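class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """Raise the feature matrix to each power in `powers` and stack the results."""

    def __init__(self, powers=(2,)):
        # Tuple default avoids sharing one mutable list between instances
        self.powers = powers

    def fit(self, X, y=None):
        # Nothing to learn; defined so that fit_transform and Pipeline work
        return self

    def transform(self, X):
        powered_arrays = []
        for power in self.powers:
            powered_arrays.append(np.power(X, power))

        # Concatenate the powered copies column-wise
        return np.hstack(powered_arrays)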

Then you can use this new transformer instead of the FeatureUnion:

fu = IrisDataManupulation(powers=[2,3])
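
As a rough check that the single transformer really replaces the union, the sketch below compares the two on the iris data. The class names `SinglePower` and `MultiPower` are hypothetical stand-ins chosen so both variants can live side by side, and the reference union is built with a list comprehension, which is another way to avoid writing the same transformer out by hand.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.pipeline import FeatureUnion


class SinglePower(TransformerMixin, BaseEstimator):
    """Hypothetical name: raises every feature to one power."""

    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.power(X, self.power)


class MultiPower(TransformerMixin, BaseEstimator):
    """Hypothetical name: raises every feature to each power in `powers`."""

    def __init__(self, powers=(2,)):
        self.powers = powers

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # One block of columns per power, stacked side by side
        return np.hstack([np.power(X, p) for p in self.powers])


X, y = load_iris(return_X_y=True)
powers = [2, 3]

# Reference: a FeatureUnion generated from the same list of powers
union = FeatureUnion([(f"power_{p}", SinglePower(power=p)) for p in powers])

X_union = union.fit_transform(X)
X_custom = MultiPower(powers=powers).fit_transform(X)

print(X_custom.shape)                  # (150, 8)
print(np.allclose(X_custom, X_union))  # True
```

Because `powers` is an ordinary constructor parameter, it is exposed through `get_params`/`set_params`, so it should be possible to tune it in a grid search by passing a list of candidate power lists.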

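Note: if you want to generate polynomial features from the original features, I recommend looking at PolynomialFeatures, which can generate the powers you need in addition to the interaction terms between features.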
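
A small sketch of what PolynomialFeatures produces, using a made-up two-feature row rather than the iris data so the columns are easy to read. Note that, unlike the custom transformer above, it also keeps the original (degree 1) columns and adds the cross term.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features, a = 2 and b = 3 (illustrative values)
X_small = np.array([[2.0, 3.0]])

# degree=2 without the constant bias column
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X_small)

# Columns are ordered: a, b, a^2, a*b, b^2
print(X_poly)  # [[2. 3. 4. 6. 9.]]
```

If only the pure powers are wanted, without the interaction column, the custom transformer remains the more direct option.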
