
Seed setting for the train function in the caret package for repeatedcv

I need to determine the seed setting for repeated cross-validation (repeatedcv) of a KNN model using the caret package in R.

My training dataset has 12 columns and 1000 rows (column 1 is the binary response and the other 11 columns are standardized predictor variables).

How can I correctly determine the seed setting for "repeatedcv" with 50 folds and 5 repeats?

Is the seed setting below correct?

Can somebody help me understand the correct seed setting for repeatedcv and LOOCV?

Please see my code below.

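set.seed(123)
# 50 folds x 5 repeats = 250 resamples, plus 1 entry for the final model
seeds <- vector(mode = "list", length = 251)
# 11 integers per resample
for (i in 1:250) seeds[[i]] <- sample.int(1000, 11)

## For the last model:
seeds[[251]] <- sample.int(1000, 1)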

The 11 in sample.int() should be the number of tuning parameter values being evaluated.
In this case, if you want to evaluate 11 values of K for KNN in each model, then you choose 11. In detail, taking 10-fold CV as an example, you will have 10 models in one repeat to average over. In each of the 10 models, train() will try 11 values of K.
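The sizing rule above can be sketched as a small helper. This is only an illustration: make_seeds() is a hypothetical function, not part of caret, and it assumes that the seeds list passed to trainControl() needs one integer vector per resample (each as long as the number of tuning values) plus a single integer for the final model. For the question's setup that gives 50 x 5 = 250 resamples and 251 list elements. The LOOCV line assumes one resample per row of the training data, which is worth checking against the caret documentation.

```r
# Hypothetical helper (not part of caret): build a seeds list for trainControl().
# n_resamples: number of resampling iterations (folds x repeats for repeatedcv)
# n_tune:      number of tuning parameter values tried in each resample
make_seeds <- function(n_resamples, n_tune, master_seed = 123) {
  set.seed(master_seed)
  seeds <- vector(mode = "list", length = n_resamples + 1)
  for (i in seq_len(n_resamples)) {
    seeds[[i]] <- sample.int(1000, n_tune)
  }
  # Last element: a single seed for the final model fit
  seeds[[n_resamples + 1]] <- sample.int(1000, 1)
  seeds
}

# repeatedcv with 50 folds and 5 repeats, 11 values of K
seeds_rcv <- make_seeds(n_resamples = 50 * 5, n_tune = 11)

# LOOCV on 1000 rows (assumption: one resample per row)
seeds_loocv <- make_seeds(n_resamples = 1000, n_tune = 11)

# Passing the seeds to caret, only if the package is installed
if (requireNamespace("caret", quietly = TRUE)) {
  ctrl <- caret::trainControl(method = "repeatedcv", number = 50,
                              repeats = 5, seeds = seeds_rcv)
  # Then, with your own data (names here are placeholders):
  # fit <- caret::train(y ~ ., data = train_df, method = "knn",
  #                     tuneLength = 11, trControl = ctrl)
}
```

If tuneLength (or the number of rows in tuneGrid) changes, n_tune should change with it so that each resample still has one seed per candidate value of K.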
Two similar questions already have great answers:
Set seed parallel random forest in caret
Fully reproducible parallel models using caret
