
Is it possible to validate a deep learning model by training on a small data subset?

Original post 2019-09-14 15:55:19 8 2 python/ tensorflow/ keras/ resnet/ vgg-net

I am looking to train a large model (ResNet or VGG) for face identification.

Is it a valid strategy to train on a few faces (1..3) to validate a model?

In other words, if a model learns one face well, is that evidence that the model is good for the task?

The point here is that I don't want to spend a week of expensive GPU time only to find out that my model is no good, or my data has errors, or my TF code has a bug.

2 answers

Short answer: No, because deep learning works well only with huge amounts of data.

Long answer: No. The problem is that learning only one face could overfit your model to that specific face, without learning features that are not present in your examples. For example, the model may learn to detect your face thanks to one specific, very simple pattern in that face (this is called overfitting).

To give a deliberately simple example: your model has learned to detect that face because there is a mole on your right cheek, and it has learned to identify the mole.

To make your model perform well in the general case, you need a huge amount of data, so that the model is able to learn different kinds of patterns.

Suggestion: Because training a deep neural network is time-consuming, one usually does not train a single neural network at a time. Instead, many neural networks are trained in parallel with different hyperparameters (layers, nodes, activation functions, learning rate, etc.).
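As a rough sketch of the suggestion above, the candidate configurations can be enumerated before launching any training jobs. The search space below is purely hypothetical (the names `num_layers`, `activation` and `learning_rate` and their values are assumptions for illustration), and the actual training of each configuration is left out.

```python
from itertools import product

# Hypothetical search space; the names and values are illustrative only.
search_space = {
    "num_layers": [18, 34, 50],
    "activation": ["relu", "elu"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}


def enumerate_configs(space):
    """Return one dict per combination of hyperparameter values."""
    names = list(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


configs = enumerate_configs(search_space)
print(len(configs), "configurations to train in parallel")
print(configs[0])
```

Each dictionary in `configs` could then be handed to a separate training job, for example one per GPU.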

Edit because of the discussion below:

If your dataset is small, it is nearly impossible to get good performance in the general case, because the neural network will learn the easiest pattern, which is usually not the most general or the best one.

Adding data forces the neural network to extract good patterns that work in the general case.

It's a tradeoff, but training on a small dataset usually does not lead to a good classifier for the general case.

Edit 2: rephrasing everything to make it clearer. Good performance on a small dataset doesn't tell you whether your model will be a good model when trained on the whole dataset. That's why you train on the majority of your dataset and test/validate on a smaller held-out part of it.
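The train/validate split described above could look roughly like the following. This is a minimal sketch using only the standard library; the helper `train_val_split`, the 20% validation fraction, and the file names are assumptions made for illustration, not part of the original answer.

```python
import random


def train_val_split(samples, val_fraction=0.2, seed=0):
    """Shuffle a copy of `samples` and split it into (train, validation)."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names standing in for a face dataset.
image_paths = [f"face_{i:04d}.jpg" for i in range(1000)]
train_paths, val_paths = train_val_split(image_paths, val_fraction=0.2)
print(len(train_paths), "training samples,", len(val_paths), "validation samples")
```

The model is fitted on `train_paths` only, and its quality is judged on `val_paths`, which it has never seen during training.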

For face recognition, usually a Siamese network or a triplet loss is used. This is an approach to one-shot learning, which means it can perform really well given only a few examples per class (here, a person's face), but you still need to train it on many examples (faces of different people). See for example:
https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d
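To make the triplet loss mentioned above concrete, here is a minimal pure-Python sketch of one common formulation, max(d(a, p) - d(a, n) + margin, 0), where a is an anchor embedding, p an embedding of the same person, and n an embedding of a different person. The margin of 0.2, the plain (non-squared) Euclidean distance, and the toy 2-D embeddings are assumptions for illustration; some formulations use squared distances instead.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Toy 2-D embeddings (made up); real face embeddings have far more dimensions.
anchor = [0.0, 0.0]
same_person = [0.1, 0.0]
other_person = [1.0, 0.0]

# Same person is already much closer than the other person: no penalty.
well_separated = triplet_loss(anchor, same_person, other_person)
# Roles swapped, so the "positive" is farther away than the "negative".
badly_separated = triplet_loss(anchor, other_person, same_person)
print(well_separated, badly_separated)
```

The loss is zero once the same-person pair is closer than the different-person pair by at least the margin, so training pushes embeddings of different people apart rather than memorising individual faces.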

You wouldn't train your model from scratch anyway; you would use a pretrained model and fine-tune it for your task.

You could also have a look at pretrained face recognition models such as FaceNet for better results:
https://github.com/davidsandberg/facenet