How to change my code to use k-fold cross-validation with k = 5

I want to change my code so that it no longer relies on the single train_test_split in this part:

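from sklearn.model_selection import train_test_split
from pytorch_tabular import TabularModel
from pytorch_tabular.config import DataConfig, TrainerConfig
from pytorch_tabular.models import NodeConfig

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=100, test_size=0.2)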

train_data = X_train.copy()
train_data.loc[:, 'target'] = y_train

test_data = X_test.copy()
test_data.loc[:, 'target'] = y_test


data_config = DataConfig(
    # target should always be a list. Multi-targets are only supported for
    # regression. Multi-Task Classification is not implemented.
    target=['target'],
    # Use the feature columns only; train_data also contains 'target'.
    continuous_cols=X_train.columns.tolist(),
    categorical_cols=[],
    normalize_continuous_features=True
)
trainer_config = TrainerConfig(
    auto_lr_find=True,
    batch_size=64,
    max_epochs=10,

)
optimizer_config = {
    'optimizer': 'Adam',
    'optimizer_params': {'weight_decay': 0, 'amsgrad': False},
    'lr_scheduler': None,
    'lr_scheduler_params': {},
    'lr_scheduler_monitor_metric': 'valid_loss',
}

model_config = NodeConfig(
    task="classification",
    num_layers=2,
    num_trees=512,
    learning_rate=1,
    embed_categorical=True,

)
tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=optimizer_config,
    trainer_config=trainer_config,
)

tabular_model.fit(train=train_data, test=test_data)

pred = tabular_model.predict(test_data)

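pred['prediction'] = pred['prediction'].astype(int)
# Cap only the 'prediction' column at 1, not every column of the matching rows.
pred.loc[pred['prediction'] >= 1, 'prediction'] = 1

# print_metrics is a helper that is not defined in this snippet.
print_metrics(test_data['target'], pred['prediction'], tag="Holdout")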

I want to use the k-fold method with k = 5 or 10.

Thank you for your advice. The complete code example, in which I used train_test_split, is above.
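One way the hold-out split could be turned into a 5-fold loop is sketched below. It is only an illustration under several assumptions: the iris data stands in for the question's X and y (which are taken to be a pandas DataFrame and Series), and fit_and_score is a hypothetical placeholder that uses LogisticRegression so the sketch runs without pytorch_tabular. In the real code, the body of fit_and_score would presumably build a fresh TabularModel, fit it on train_data, and predict on test_data.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold


def fit_and_score(train_data, test_data):
    """Hypothetical stand-in for the per-fold work.

    In the question's setup this is where a new TabularModel would be
    created, fitted on train_data and used to predict test_data.
    LogisticRegression is used here only so that the sketch is runnable.
    """
    features = [c for c in train_data.columns if c != "target"]
    model = LogisticRegression(max_iter=1000)
    model.fit(train_data[features], train_data["target"])
    pred = model.predict(test_data[features])
    return accuracy_score(test_data["target"], pred)


# Stand-in data (assumption): X is a DataFrame and y is a Series.
X, y = load_iris(return_X_y=True, as_frame=True)

# n_splits=5 gives k = 5; set n_splits=10 for k = 10.
kf = KFold(n_splits=5, shuffle=True, random_state=100)
fold_scores = []
for fold, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    # Build the per-fold frames the same way the question does for one split.
    train_data = X.iloc[train_idx].copy()
    train_data["target"] = y.iloc[train_idx]
    test_data = X.iloc[test_idx].copy()
    test_data["target"] = y.iloc[test_idx]

    score = fit_and_score(train_data, test_data)
    fold_scores.append(score)
    print(f"Fold {fold}: accuracy = {score:.3f}")

print(f"Mean accuracy over {kf.get_n_splits()} folds: {np.mean(fold_scores):.3f}")
```

Creating the model inside the loop matters: reusing one fitted model across folds would let information from earlier folds leak into later ones. For a classification target with imbalanced classes, StratifiedKFold (with kf.split(X, y)) may be a better fit than plain KFold.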

Here is an example related to the k-fold method. Note that this excerpt performs a single hold-out split with train_test_split:



import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
from sklearn import svm

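X, y = datasets.load_iris(return_X_y=True)
print(X.shape, y.shape)

# Hold out 40% of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

# Fit a linear SVM on the training split and score it on the held-out split.
clf = svm.SVC(kernel='linear', C=1).fit(X_train, y_train)
print(clf.score(X_test, y_test))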

Result (in this example):

0.9666666666666667

The example is from here: https://scikit-learn.org/stable/modules/cross_validation.html
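For scikit-learn estimators, the k-fold version of the same experiment can be written in a few lines with cross_val_score. The sketch below reuses the iris data and the linear SVC from the example; cv=5 requests five folds (for a classifier, an integer cv uses stratified folds). It is a minimal illustration and does not apply directly to a TabularModel, which is not a scikit-learn estimator.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

X, y = datasets.load_iris(return_X_y=True)

# Same classifier as in the hold-out example, left unfitted:
# cross_val_score clones and fits it once per fold.
clf = svm.SVC(kernel="linear", C=1, random_state=42)

# One accuracy score per fold; use cv=10 for 10-fold cross-validation.
scores = cross_val_score(clf, X, y, cv=5)
print(scores)
print(f"{scores.mean():.2f} accuracy with a standard deviation of {scores.std():.2f}")
```

Reporting the mean and spread over folds gives a more stable estimate than the single score from one train/test split.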
