
Neural Network with random weights does not learn

I wanted to compare my classifier, which uses a VGG16 model pretrained on ImageNet, against how it would perform without the ImageNet weights, so I loaded the model with

model = applications.VGG16(weights=None, include_top=False, input_shape=(img_width, img_height, 3))

According to the Keras documentation, using "weights=None" results in randomly initialized weights.

My problem now is that the neural network always gives the same output: even after training for multiple epochs and trying different learning rates, it predicts every image as the same class.

I do not think the input data (images of 2 different classes) or my code is the problem, because when I initialize with the ImageNet weights and train on the same data, my classifier learns very well and reaches 90% accuracy on the test set.

What could the problem be? Maybe the weight initialization? But I don't know how to use a different initializer when loading the model like that.

You are probably facing a vanishing gradient problem.

If you use the ReLU activation function, look at "Kaiming initialisation" for your weights. The objective is to keep a mean of 0 and a standard deviation of 1 for the output of each layer during the forward pass.

For the ReLU activation function you initialize with a random normal distribution multiplied by the square root of 2 / (number of inputs to the given layer).

weight_initialisation = random_normal * sqrt(2 / (number of inputs for the layer))

For a CNN, I think the number of inputs will be the number of input channels * the number of cells in the kernel (e.g. in_channels * 5 * 5 for a [5, 5] kernel).
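
For reference, here is a minimal sketch of re-initialising a weights=None VGG16 with the He/Kaiming scheme described above. It assumes standalone Keras (from keras import applications) and placeholder img_width/img_height values, so adapt it to your own setup:

import numpy as np
from keras import applications

# Placeholder sizes for illustration; use whatever img_width/img_height you train with.
img_width, img_height = 224, 224

model = applications.VGG16(weights=None, include_top=False,
                           input_shape=(img_width, img_height, 3))

# Re-initialise every layer that has trainable kernels with He/Kaiming normal:
# N(0, sqrt(2 / fan_in)), where fan_in = kernel_h * kernel_w * in_channels for a
# conv layer (or the input dimension for a dense layer). Biases are reset to zero.
for layer in model.layers:
    weights = layer.get_weights()
    if not weights:
        continue  # pooling/input layers have no weights
    kernel = weights[0]
    fan_in = int(np.prod(kernel.shape[:-1]))  # every dim except the output filters
    new_kernel = np.random.randn(*kernel.shape) * np.sqrt(2.0 / fan_in)
    layer.set_weights([new_kernel] + [np.zeros_like(w) for w in weights[1:]])

Alternatively, Keras ships a built-in he_normal initializer, so if you build the convolutional blocks yourself instead of calling applications.VGG16, you can pass kernel_initializer='he_normal' to each Conv2D layer.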
