Imblearn Pipeline and HyperOpt Issue
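I'm currently trying to oversample with SMOTE and then run my XGBClassifier inside a pipeline. For some reason, I can't get HyperOpt to work with the Pipeline.
The following two examples both run fine: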
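# Example 1: SMOTE + XGBClassifier pipeline, scored with cross-validation
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # imblearn's Pipeline accepts samplers such as SMOTE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

smote = SMOTE(random_state=42)
model = XGBClassifier(random_state=42)
pipe = Pipeline([('smote', smote),
                 ('model', model)])
cv = StratifiedKFold(n_splits=5)
score = cross_val_score(pipe, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()
print(score)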
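# Example 2: HyperOpt tuning the bare XGBClassifier (no pipeline)
import numpy as np
from hyperopt import STATUS_OK, Trials, fmin, tpe

model = XGBClassifier(random_state=42)

def objective_pipe(params):
    model.set_params(**params)
    cv = StratifiedKFold(n_splits=5)
    score = cross_val_score(model, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()
    # HyperOpt minimizes the loss, so negate the AUC.
    return {'loss': -score, 'params': params, 'status': STATUS_OK}

# `params` is the HyperOpt search space; its definition is not shown in the post.
trials = Trials()
best = fmin(fn=objective_pipe, space=params, algo=tpe.suggest, max_evals=10, trials=trials, rstate=np.random.RandomState(42))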
However, when I put the pipeline into the objective function, I end up with NaN for the score.
# Failing case: the same objective, but cross-validating the pipeline
smote = SMOTE(random_state=42)
model = XGBClassifier(random_state=42)
pipe = Pipeline([('smote', smote),
                 ('model', model)])

def objective_pipe(params):
    pipe.set_params(**params)
    cv = StratifiedKFold(n_splits=5)
    score = cross_val_score(pipe, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()
    return {'loss': -score, 'params': params, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=objective_pipe, space=params, algo=tpe.suggest, max_evals=10, trials=trials, rstate=np.random.RandomState(42))
Maybe I'm just missing something very simple, but I'm not sure how to fix this. Any suggestions, help, or resources are welcome.
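One thing worth ruling out: pipelines address a step's parameters as <step>__<param>, so a search space written for the bare XGBClassifier (keys such as max_depth) would need a model__ prefix before pipe.set_params accepts it. The question does not show params, so whether this applies here is an assumption. A minimal sketch, using a hypothetical helper prefix_params and made-up sample values:

```python
def prefix_params(params, step="model"):
    """Map estimator parameter names to Pipeline names.

    Hypothetical helper, not part of the original post:
    'max_depth' becomes 'model__max_depth' for a step named 'model'.
    """
    return {f"{step}__{name}": value for name, value in params.items()}


# Made-up sampled values; the real search space is not shown in the question.
sampled = {"max_depth": 4, "learning_rate": 0.1}
print(prefix_params(sampled))
# {'model__max_depth': 4, 'model__learning_rate': 0.1}
```

Inside the objective, this would be used as pipe.set_params(**prefix_params(params)). Alternatively, the prefixed names can be written directly as the keys of the search space.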
I'm not entirely sure why, but I had a similar problem, and it went away after setting n_jobs=1. I think it has to do with SMOTE not being able to run in parallel.