
Neural network back-propagation: error in training

After reading some articles about neural networks (back-propagation), I tried to write a simple neural network myself.

I decided on an XOR neural network. My problem is that when I train the network with only one example, say 1,1,0 (as input1, input2, targetOutput), then after roughly 500 training iterations the network answers about 0.05. But if I try more than one example (say 2 different ones, or all 4 possibilities), the network tends toward 0.5 as output :( I searched Google for my mistakes with no results :S I'll try to give as much detail as I can to help find what's wrong:

-I've tried networks with 2,2,1 and 2,4,1 (input layer, hidden layer, output layer).

-the output for every neuron is defined by:

double input = 0.0;
for (int n = 0; n < layers[i].Count; n++)
    input += layers[i][n].Output * weights[n];

where 'i' is the current layer and 'weights' are all the weights from the previous layer.

-the last layer's (output layer's) error is defined by:

value * (1 - value) * (targetvalue - value);

where 'value' is the neuron's output and 'targetvalue' is the target output for the current neuron.
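This delta is the standard rule for a logistic (sigmoid) output unit. As a quick sanity check, here is a hypothetical worked example (the numbers are made up, not taken from the question):

```python
# Output-layer delta for a logistic unit: value * (1 - value) * (target - value).
# Hypothetical numbers: the unit currently outputs 0.8 but the target is 0.
value = 0.8
target = 0.0
delta = value * (1 - value) * (target - value)
print(delta)  # 0.8 * 0.2 * -0.8 = -0.128 (up to rounding)
```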

-the error for the other neurons is defined by:

foreach neural in the nextlayer
    sum += neural.value * currentneural.weights[neural];
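For comparison, the textbook back-propagation rule for a hidden unit sums the next layer's *error* terms weighted by the connecting weights, then multiplies by the derivative of the logistic activation. A sketch in Python (the function and variable names are illustrative, not from the question's code):

```python
# Hidden-unit delta for a logistic activation:
#   delta_h = out_h * (1 - out_h) * sum_k(delta_k * w_hk)
def hidden_delta(out_h, next_deltas, weights_to_next):
    s = sum(d * w for d, w in zip(next_deltas, weights_to_next))
    return out_h * (1 - out_h) * s

# Hypothetical numbers: one hidden unit feeding two output units.
print(hidden_delta(0.5, [-0.128, 0.1], [0.4, -0.3]))
# 0.25 * (-0.128 * 0.4 + 0.1 * -0.3) = -0.0203 (up to rounding)
```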

-all the weights in the network are adapted by this formula (the weight from neuron -> neuron2):

weight += LearnRate * neural.myvalue * neural2.error;

where LearnRate is the network's learning rate (defined as 0.25 in my network).

-the bias weight for each neuron is defined by:

bias += LearnRate * neural.myerror * neural.Bias;

where the bias input is a constant value of 1.
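Putting the two update rules together, one gradient step for a single weight and its bias weight looks like this (hypothetical numbers; the bias input is the constant 1 mentioned above):

```python
learn_rate = 0.25  # the learning rate used in the question

# Weight from neuron -> neuron2: weight += rate * neuron.output * neuron2.delta
weight = 0.5
neuron_output = 0.9
neuron2_delta = -0.128
weight += learn_rate * neuron_output * neuron2_delta
print(weight)  # 0.5 + 0.25 * 0.9 * -0.128 = 0.4712 (up to rounding)

# The bias weight update uses the constant bias input of 1:
bias_weight = 0.1
bias_input = 1.0
bias_weight += learn_rate * neuron2_delta * bias_input
print(bias_weight)  # 0.1 + 0.25 * -0.128 = 0.068 (up to rounding)
```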

That's pretty much all I can detail. As I said, the output tends toward 0.5 with different training examples :(

Thank you very much for your help ^_^.

It is difficult to tell where the error is without seeing the complete code. One thing you should carefully check is that your calculation of the local error gradient for each unit matches the activation function you are using on that layer. Have a look here for the general formula: http://www.learnartificialneuralnetworks.com/backpropagation.html

For instance, the calculation you do for the output layer assumes that you are using a logistic sigmoid activation function, but you don't mention that in the code above, so it looks like you are using a linear activation function instead.
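The logistic sigmoid and its derivative, which the delta formulas above assume, can be written as follows (a minimal sketch):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1 - s)  # this is the value*(1-value) factor in the deltas

# A net input of 0 gives an output of exactly 0.5 -- which is also what a
# network whose weights carry no useful signal tends to produce, matching
# the 0.5 symptom described in the question.
print(sigmoid(0.0))        # 0.5
print(sigmoid_prime(0.0))  # 0.25
```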

In principle a 2-2-1 network should be enough to learn XOR, although the training will sometimes get trapped in a local minimum without being able to converge to the correct state. So it is important not to draw conclusions about the performance of your algorithm from a single training session. Note that simple backprop is bound to be slow; there are faster and more robust solutions, such as Rprop.

There are books on the subject which provide detailed step-by-step calculations for a simple network (e.g. 'AI: A Guide to Intelligent Systems' by Negnevitsky); this could help you debug your algorithm. An alternative would be to use an existing framework (e.g. Encog, FANN, MATLAB), set up the exact same topology and initial weights, and compare the calculations with your own implementation.
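To make the comparison concrete, here is a minimal self-contained 2-4-1 back-propagation trainer for XOR in Python. This is an illustrative sketch, not the asker's code; with a fixed random seed it usually converges, but as noted above such a small net can get stuck in a local minimum, so the only safe claim is that the mean squared error decreases:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(42)
H = 4  # hidden units (2-4-1 topology)
w_ih = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]  # input -> hidden
b_h = [random.uniform(-1, 1) for _ in range(H)]                       # hidden biases
w_ho = [random.uniform(-1, 1) for _ in range(H)]                      # hidden -> output
b_o = random.uniform(-1, 1)                                           # output bias

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
rate = 0.5

def forward(x):
    h = [sigmoid(sum(w_ih[j][i] * x[i] for i in range(2)) + b_h[j]) for j in range(H)]
    o = sigmoid(sum(w_ho[j] * h[j] for j in range(H)) + b_o)
    return h, o

def mse():
    return sum((t - forward(x)[1]) ** 2 for x, t in data) / len(data)

before = mse()
for _ in range(5000):
    for x, t in data:
        h, o = forward(x)
        d_o = o * (1 - o) * (t - o)                                   # output delta
        d_h = [h[j] * (1 - h[j]) * d_o * w_ho[j] for j in range(H)]   # hidden deltas
        for j in range(H):
            w_ho[j] += rate * h[j] * d_o
            for i in range(2):
                w_ih[j][i] += rate * x[i] * d_h[j]
            b_h[j] += rate * d_h[j]
        b_o += rate * d_o
after = mse()
print(before > after)  # the error should have dropped during training
```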
