
Is the test set used to update the weights of a deep learning model in Keras?

Original post: 2018-11-30 14:10:47 | Tags: python, keras, deep-learning

I'm wondering whether results on the test set are used in any way to optimize a model's weights. I'm trying to build a model, but I don't have much data because the subjects are medical research patients. The number of patients is limited (61), and I have 5 feature vectors per patient. I trained a deep learning model with one subject excluded and used the excluded subject as the test set. My problem is that the features vary a lot between subjects: the model fits the training set (60 subjects) well but performs much worse on the 1 excluded subject. So, could the test set (in my case, the excluded subject) be used in some way to make the model converge toward classifying the excluded subject better?
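For reference, the leave-one-subject-out setup described in the question can be sketched roughly as follows. This is only an illustration in plain Python: the function name `leave_one_subject_out` and the row layout (5 consecutive rows per subject) are assumptions, not the asker's actual code. The point is that every feature vector of the held-out subject stays in the test fold and none of them is available for training.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_rows, test_rows) for each subject.

    subject_ids[i] is the subject that row i belongs to, so all rows of the
    held-out subject end up in the test fold together.
    """
    for held_out in sorted(set(subject_ids)):
        train_rows = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_rows = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_rows, test_rows


# Assumed layout matching the question: 61 subjects, 5 feature vectors each.
subject_ids = [subject for subject in range(61) for _ in range(5)]
folds = list(leave_one_subject_out(subject_ids))
print(len(folds), len(folds[0][1]), len(folds[0][2]))  # prints: 61 300 5
```

Repeating the train/test cycle over all 61 folds and averaging the scores gives a more stable picture than a single excluded subject, which matters when subjects differ as much as described here.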

2 answers

You should not use the test data from your data set in your training process. If you don't have enough training data, one approach that is used a lot these days (especially for medical images) is data augmentation, so I highly recommend using this technique when you train. "How to use Deep Learning when you have Limited Data" is a good tutorial on data augmentation.
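Data augmentation is usually shown for images, but the question is about feature vectors. One possible analogue, sketched here under assumptions, is to add small Gaussian noise to each training row. The helper `augment_with_noise`, the toy values, and the defaults `copies=3` and `noise_std=0.05` are all made up for illustration; whether noise of this kind is appropriate depends on what the features actually measure.

```python
import random


def augment_with_noise(features, labels, copies=3, noise_std=0.05, seed=0):
    """Return the original rows plus `copies` noisy variants of each row."""
    rng = random.Random(seed)
    aug_features = [list(row) for row in features]
    aug_labels = list(labels)
    for row, label in zip(features, labels):
        for _ in range(copies):
            # Jitter every feature independently; the label is unchanged.
            aug_features.append([x + rng.gauss(0.0, noise_std) for x in row])
            aug_labels.append(label)
    return aug_features, aug_labels


# Toy training rows (made-up values). Only training data is augmented;
# the held-out subject is left untouched.
train_x = [[0.2, 1.5, 3.1], [0.4, 1.1, 2.9]]
train_y = [0, 1]
aug_x, aug_y = augment_with_noise(train_x, train_y)
print(len(aug_x), len(aug_y))  # prints: 8 8
```

Augmentation is applied after the split, so that noisy copies of a test row can never leak into the training set.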

No, you shouldn't use your test set for training; keeping it separate is how you guard against overfitting. If you follow cross-validation principles, you need to split your data into three sets: a training set that you use to fit the model, a validation set that you use to compare different values of your hyperparameters, and a test set that you use only for the final evaluation. If you use all of your data for training, the model will overfit and you will have no held-out data left to reveal it.

One thing to remember: deep learning works well when you have a large and very rich dataset.
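The three-way split described in this answer could look something like the sketch below. It splits by subject rather than by row, so that the 5 feature vectors of one patient never straddle two sets. The function name `split_subjects` and the 15% fractions are assumptions chosen for illustration. In Keras, the validation portion would typically be passed to `fit` through `validation_data`, which is evaluated each epoch but not used for weight updates, while the test portion is kept for a final `evaluate` call.

```python
import random


def split_subjects(subjects, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle subject IDs and split them into train, validation and test."""
    shuffled = list(subjects)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    n_val = max(1, round(len(shuffled) * val_fraction))
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# 61 subjects, as in the question.
train_s, val_s, test_s = split_subjects(range(61))
print(len(train_s), len(val_s), len(test_s))  # prints: 43 9 9
```

With so few subjects, a single fixed split leaves only a handful of patients per held-out set, so repeating the split or using leave-one-subject-out folds may give a more reliable estimate.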