
CV or train/predict in mlr3

The post "The "Cross-Validation - Train/Predict" misunderstanding" by Patrick Schratz (https://mlr-org.com/docs/cv-vs-predict/) states that:

(a) CV is done to get an estimate of a model's performance.

(b) Train/predict is done to create the final predictions (which your boss might use to make some decisions on).

Does this mean that in mlr3, if we are in academia and need to publish papers, we should use CV because we intend to compare the performance of different algorithms? And in industry, if the plan is to train a model once and then use it again and again to make predictions on industry data, should we use the train/predict methods provided by mlr3?

Or have I completely misunderstood this?

Thank you.

You always need CV if you want to make a statement about a model's performance.

If you want to use the model to make predictions on new, unseen data, do a single fit and then predict.

So in practice, you need both: CV + "train+predict".
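A minimal sketch of how the two steps might look in mlr3. The task (`sonar`), the learner (`classif.rpart`), the 5-fold setup, and the accuracy measure are assumptions picked for illustration and do not come from the question; the rows used as "new data" are simply taken from the task itself to keep the example self-contained.

```r
library(mlr3)

set.seed(1)  # make the fold assignment reproducible

# Assumption: the built-in "sonar" task stands in for your labelled data
task = tsk("sonar")
learner = lrn("classif.rpart")

# (a) Cross-validation: used only to estimate performance.
# resample() works on clones of the learner, so `learner` itself stays untrained.
cv = rsmp("cv", folds = 5)
rr = resample(task, learner, cv)
cv_acc = rr$aggregate(msr("classif.acc"))
print(cv_acc)

# (b) Train/predict: a single fit on all labelled data, then predict.
learner$train(task)

# Assumption: five feature-only rows play the role of new, unlabelled data
new_data = task$data(rows = 1:5, cols = task$feature_names)
pred = learner$predict_newdata(new_data)
print(pred$response)
```

The number from (a) is what you would report as the expected performance; the model from (b) is the one you would actually deploy.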

PS: Your post is not really a good fit for Stack Overflow since it is not related to a coding problem. For statistical questions, please see https://stats.stackexchange.com/.

PS2: If you talk about a post, please include the link. I am the author of the post in this case, but most other people might not know what you are talking about ;)
