Why does NN training loss flatten?

I implemented a deep learning neural network from scratch without using any Python frameworks like TensorFlow or Keras.

The problem is that no matter what I change in my code, like adjusting the learning rate, changing the layers or the number of nodes, or switching the activation functions from sigmoid to ReLU to leaky ReLU, I end up with a training loss that starts at 6.98 but always converges to 3.24...

Why is that?

Please review my forward and backward propagation code. Maybe there is something wrong in it that I could not identify.

My hidden layers use leaky ReLU and the final layer uses sigmoid activation. I'm trying to classify the MNIST handwritten digits.

Code:

# FORWARD PROPAGATION

# Hidden layers: linear transform followed by leaky ReLU
for i in range(layers-1):
    cache["a"+str(i+1)] = lrelu(np.dot(param["w"+str(i+1)], cache["a"+str(i)]) + param["b"+str(i+1)])

# Output layer: sigmoid activation
cache["a"+str(layers)] = sigmoid(np.dot(param["w"+str(layers)], cache["a"+str(layers-1)]) + param["b"+str(layers)])

yn = cache["a"+str(layers)]   # network output
m = X.shape[1]                # number of training examples

# element-wise binary cross-entropy, summed over outputs and averaged over the batch
cost = -np.sum(y*np.log(yn) + (1-y)*np.log(1-yn))/m

if j % 10 == 0:               # log the cost every 10 iterations
    print(cost)
    costs.append(cost)
    

# BACKPROPAGATION

grad={"dz"+str(layers):yn-y}


for i in range(layers):
    grad["dw"+str(layers-i)]=np.dot(grad["dz"+str(layers-i)],cache["a"+str(layers-i-1)].T)/m
    

    grad["db"+str(layers-i)]=np.sum(grad["dz"+str(layers-i)],1,keepdims=True)/m
    
    if i<layers-1:
        grad["dz"+str(layers-i-1)]=np.dot(param["w"+str(layers-i)].T,grad["dz"+str(layers-i)])*lreluDer(cache["a"+str(layers-i-1)])

for i in range(layers):
    param["w"+str(i+1)]=param["w"+str(i+1)] - alpha*grad["dw"+str(i+1)]
    param["b"+str(i+1)]=param["b"+str(i+1)] - alpha*grad["db"+str(i+1)]

The implementation seems okay. While you could converge to the same value with different models/learning rates/hyperparameters, what's frightening is having the same starting value every time, 6.98 in your case.

I suspect it has to do with your initialisation. If you're setting all your weights to zero initially, you're not going to break symmetry. That is explained here and here in adequate detail.
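To illustrate the point, one common fix is to initialize the weights with small random values instead of zeros. A minimal sketch, assuming a hypothetical init_params helper and a layer_sizes list, that fills the same "w"/"b" dictionary keys the question's code uses (the sqrt(2/fan_in) He-style scaling is one reasonable choice for ReLU-like activations, not necessarily what the original code should use):

import numpy as np

def init_params(layer_sizes, seed=0):
    # layer_sizes, e.g. [784, 64, 32, 10] for MNIST
    rng = np.random.default_rng(seed)
    param = {}
    for i in range(1, len(layer_sizes)):
        # small random weights break the symmetry between units in a layer
        param["w"+str(i)] = rng.standard_normal((layer_sizes[i], layer_sizes[i-1])) * np.sqrt(2.0 / layer_sizes[i-1])
        param["b"+str(i)] = np.zeros((layer_sizes[i], 1))
    return param

With random weights each unit receives a different gradient from the start, so the loss trajectory should no longer be identical across runs and configurations.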

