![](/img/trans.png)
[英]Custom transformer mixin with FeatureUnion in scikit-learn
[英]Scikit-Learn, Custom Transformer: ColumnSelectTransformer
給定數據來自
%%bash
mkdir data
wget http://dataincubator-wqu.s3.amazonaws.com/mldata/providers-train.csv -nc -P ./ml-data
wget http://dataincubator-wqu.s3.amazonaws.com/mldata/providers-metadata.csv -nc -P ./ml-data
我的任務是完成以下代碼片段。
simple_cols = ['BEDCERT', 'RESTOT', 'INHOSP', 'CCRC_FACIL', 'SFF', 'CHOW_LAST_12MOS', 'SPRINKLER_STATUS', 'EXP_TOTAL', 'ADJ_TOTAL']
class ColumnSelectTransformer(BaseEstimator, TransformerMixin):
def __init__(self, columns):
self.columns = columns
def fit(self, X, y=None):
return self
def transform(self, X):
if not isinstance(X, pd.DataFrame):
X = pd.DataFrame(X)
return X[self.columns]
simple_features = Pipeline([
('cst', ColumnSelectTransformer(simple_cols)),
])
assert data['RESTOT'].isnull().sum() > 0
assert not np.isnan(simple_features.fit_transform(data)).any()
完成這個斷言檢查
assert data['RESTOT'].isnull().sum() > 0
assert not np.isnan(simple_features.fit_transform(data)).any()
但是我無法通過斷言測試。 下面附上我的嘗試:
simple_cols = ['BEDCERT', 'RESTOT', 'INHOSP', 'CCRC_FACIL', 'SFF', 'CHOW_LAST_12MOS', 'SPRINKLER_STATUS', 'EXP_TOTAL', 'ADJ_TOTAL']
class ColumnSelectTransformer(BaseEstimator, TransformerMixin):
def __init__(self, columns):
self.columns = columns
def fit(self, X, y=None):
return self
def transform(self, X):
if not isinstance(X, pd.DataFrame):
X = pd.DataFrame(X)
X.dropna(inplace=True)
return X[self.columns].values()
simple_features = Pipeline([
('cst', ColumnSelectTransformer(simple_cols)),
])
希望能得到任何幫助,謝謝!
問題出在
return X[self.columns].values()
正確的 return 語句應該是
return X[self.columns].values
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.