Train/Test Set Proportion in cv.glmnet from glmnet package in R

I was wondering what the train/test split percentage is in cv.glmnet from the glmnet package in R. I have read the glmnet package documentation, but it does not state the train/test set proportion. Please tell me if I missed something in the documentation. Any help would be greatly appreciated. Thank you.

On the help page for ?cv.glmnet, there are two parts to look at:

Argument nfolds

number of folds - default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3

And from the Value section, for foldid

if keep=TRUE, the fold assignments used

i.e. set keep=TRUE in the function call to access the fold assignments afterwards.
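A minimal sketch of reading the fold assignments back, assuming the glmnet package is installed. The data here are simulated purely for illustration and are not from the question.

```r
library(glmnet)

set.seed(1)                          # fold assignment is random
n <- 100
x <- matrix(rnorm(n * 5), nrow = n)  # simulated predictors (assumption)
y <- rnorm(n)                        # simulated response (assumption)

fit <- cv.glmnet(x, y, keep = TRUE)  # nfolds left at its default of 10

length(fit$foldid)  # one fold label per row of x
table(fit$foldid)   # number of rows held out in each fold
```

Each entry of fit$foldid is the fold in which that row was held out for testing.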

By default, the function assigns each row to one of 10 roughly equal-sized groups (folds). It then fits the model 10 times, each time holding out one fold for testing and training on the other nine. So it's 90% train and 10% test, repeated 10 times. More generally, with nfolds = k each fit trains on about (k-1)/k of the rows and tests on the remaining 1/k.
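The split sizes can be checked with base R alone. This is only a sketch of the idea: the sample size is hypothetical, and the fold assignment line imitates what cv.glmnet does, as far as I know, when foldid is not supplied.

```r
set.seed(42)
n      <- 103   # hypothetical sample size, deliberately not a multiple of 10
nfolds <- 10    # the cv.glmnet default

# Shuffle the fold labels 1..nfolds across the n rows
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# For each fold: rows held out for testing vs. rows used for training
test_n  <- as.vector(table(foldid))
train_n <- n - test_n

data.frame(fold       = seq_len(nfolds),
           train      = train_n,
           test       = test_n,
           test_share = round(test_n / n, 3))
```

When n is not a multiple of nfolds, some folds get one extra row, so the test share is only approximately 10%.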

You can supply your own folds with the foldid argument if you prefer. Hope that helps :)
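For example, a sketch with a custom 5-fold split, again on simulated data and assuming glmnet is installed. The name my_folds is made up for illustration.

```r
library(glmnet)

set.seed(1)
n <- 100
x <- matrix(rnorm(n * 5), nrow = n)  # simulated predictors (assumption)
y <- rnorm(n)                        # simulated response (assumption)

# Hypothetical custom split: 5 folds, so each fit trains on
# about 80% of the rows and tests on about 20%
my_folds <- sample(rep(1:5, length.out = n))

fit <- cv.glmnet(x, y, foldid = my_folds)
fit$lambda.min  # lambda with the lowest cross-validated error
```

When foldid is supplied, nfolds does not need to be given.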


 