How to do parameter tuning in LogisticRegression using StratifiedKFold in Python?
I need to try, for example, 6 values of C and see the mean roc_auc_score over the 10 folds for each value of C.
My attempt so far:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

lr = LogisticRegression(C=1,
                        penalty='l1',
                        solver='liblinear',
                        tol=0.0001,
                        max_iter=3000,
                        intercept_scaling=1.0,
                        multi_class='auto',
                        random_state=42)
C = [0.01, 0.05, 0.1, 1, 10, 12]
final_scores = []
mean_scores = {}
# Stratified KFold
skf = StratifiedKFold(n_splits=10, random_state=42, shuffle=False)
for c in C:
    for fold, (train_index, test_index) in enumerate(skf.split(X, y)):
        print("Fold:", fold + 1)
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
        y_train, y_test = y.iloc[train_index], y.iloc[test_index]
        lr.fit(X_train, y_train)
        predictions = lr.predict_proba(X_train)[:, 1]
        final_scores.append(roc_auc_score(y_train, predictions))
        print("AUC SCORE:" + str(roc_auc_score(y_train, predictions)))
    mean_scores[c] = np.mean(final_scores)
    print("---")
print(mean_scores)
I need a resulting dictionary whose keys are the C values and whose values are the mean score over the 10 folds for each C.
Edit:
roc_dict = dict()
C = [0.01, 0.05, 0.1, 1, 10, 12]
for c in C:
    final_scores = []
    mean_scores = {}
    for fold, (train_index, test_index) in enumerate(skf.split(X, y)):
        print("Fold:", fold + 1)
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
        y_train, y_test = y.iloc[train_index], y.iloc[test_index]
        lr.fit(X_train, y_train)
        predictions = lr.predict_proba(X_train)[:, 1]
        final_scores.append(roc_auc_score(y_train, predictions))
        print("AUC SCORE:" + str(roc_auc_score(y_train, predictions)))
    roc_dict[c] = np.mean(final_scores)
You're almost there. You can define an empty dict before your loop:
roc_dict = dict()
Run your loop, but place your list and dict inside it so that they reset on every iteration (or create new ones):
for c in C:
    final_scores = []
    mean_scores = {}
    # no change here, paste your original code
    roc_dict[c] = final_scores  # add this
It will result in this:
Out[90]:
{0.01: [0.7194940476190477,
        0.7681686046511628,
        0.653343023255814,
        0.6596194503171249],
 0.05: [0.7194940476190477,
        0.7681686046511628,
        0.653343023255814,
        0.6596194503171249],
 0.1: [0.7194940476190477,
       0.7681686046511628,
       0.653343023255814,
       0.6596194503171249], # ... etc. But with 10 folds instead.
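Note that the lists above are identical for every C. In the code as posted, the loop variable c is never passed to the model (lr keeps C=1 throughout), and the AUC is computed on X_train rather than on the held-out X_test. The following is a minimal sketch of one way to address both points and store the mean per C; it is not the only approach. The helper name mean_auc_per_c and the synthetic data from make_classification are assumptions made for illustration, since the question's X and y are not shown. It indexes NumPy arrays directly; with pandas objects you would keep .iloc as in the question. It also uses shuffle=True, because recent scikit-learn versions raise an error when random_state is set together with shuffle=False.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def mean_auc_per_c(X, y, c_values, n_splits=10):
    """Return {C: mean held-out ROC AUC over the stratified folds}.

    Hypothetical helper written for this example.
    """
    # shuffle=True so that random_state actually has an effect.
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    base = LogisticRegression(penalty="l1", solver="liblinear",
                              max_iter=3000, random_state=42)
    roc_dict = {}
    for c in c_values:
        fold_scores = []  # reset for every value of C
        for train_index, test_index in skf.split(X, y):
            # Fresh, unfitted copy of the model with the current C applied.
            model = clone(base).set_params(C=c)
            model.fit(X[train_index], y[train_index])
            # Score on the held-out fold, not on the training rows.
            proba = model.predict_proba(X[test_index])[:, 1]
            fold_scores.append(roc_auc_score(y[test_index], proba))
        roc_dict[c] = float(np.mean(fold_scores))
    return roc_dict


# Synthetic stand-in data (an assumption; replace with your own X and y).
X_demo, y_demo = make_classification(n_samples=300, n_features=10,
                                     random_state=0)
C = [0.01, 0.05, 0.1, 1, 10, 12]
roc_dict = mean_auc_per_c(X_demo, y_demo, C)
for c, score in roc_dict.items():
    print(f"C={c}: mean AUC={score:.4f}")
```

If you only need the mean per C and not the per-fold printout, scikit-learn's cross_val_score with scoring="roc_auc" and cv=skf, or GridSearchCV over a {"C": [...]} grid, should give you the same kind of result with less code.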