
Neural Network convergence speed (Levenberg-Marquardt) (MATLAB)

I was trying to approximate a function (single input and single output) with an ANN. Using the MATLAB toolbox I could see that, with 5 or more neurons in the hidden layer, I can achieve a very nice result. So I am trying to do it manually.

Calculations: As the network has only one input and one output, the partial derivative of the error (e = d - o, where d is the desired output and o is the actual output) with respect to a weight that connects a hidden neuron j to the output neuron is -hj (where hj is the output of hidden neuron j); the partial derivative of the error with respect to the output bias is -1; the partial derivative of the error with respect to a weight that connects the input to a hidden neuron j is -woj*f'*i, where woj is the output weight of hidden neuron j, f' is the tanh() derivative and i is the input value; finally, the partial derivative of the error with respect to a hidden-layer bias is the same as above (with respect to an input weight), except that the input factor is absent: -woj*f'.
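
Restated in equation form (my notation: hj is the output of hidden neuron j, net_hj its pre-activation, i the input, and e = d - o the error), the derivatives above are:

\[
\frac{\partial e}{\partial w_{oj}} = -h_j,\qquad
\frac{\partial e}{\partial b_o} = -1,\qquad
\frac{\partial e}{\partial w_{hj}} = -\,w_{oj}\, f'(\mathrm{net}_{hj})\, i,\qquad
\frac{\partial e}{\partial b_{hj}} = -\,w_{oj}\, f'(\mathrm{net}_{hj}),
\qquad\text{with } f'(x) = 1 - \tanh^2(x).
\]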

The problem is: the MATLAB algorithm always converges faster and to a better solution. I can achieve the same curve as MATLAB does, but my algorithm requires many more epochs. I have tried removing the pre- and post-processing functions from the MATLAB algorithm; it still converges faster. I have also tried creating and configuring the network and extracting the weight/bias values before training, so I could copy them into my algorithm and see whether it converges faster, but nothing changed (is the weight/bias initialization done inside the create/configure functions or inside the train function?).
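
For reference, here is a minimal sketch of how the initial weights and biases can be read out of a toolbox network before training. It assumes the feedforwardnet interface (5 tansig hidden neurons, purelin output, matching the network described above) and that t and yp are the 1-by-N training vectors used in the loop below; whether train re-initializes these values afterwards is exactly the question raised above.

% Sketch: extract initial weights/biases from a toolbox network
% (assumes the feedforwardnet interface; variable names mirror the code below)
net = feedforwardnet(5, 'trainlm');
net.inputs{1}.processFcns  = {};      % disable input preprocessing (mapminmax by default)
net.outputs{2}.processFcns = {};      % disable output postprocessing
net = configure(net, t, yp);          % sets layer sizes and initializes weights/biases
wh = net.IW{1,1};                     % input -> hidden weights
wo = net.LW{2,1};                     % hidden -> output weights
bi = net.b{1};                        % hidden biases
bo = net.b{2};                        % output bias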

Does the MATLAB algorithm have some kind of optimization inside the code? Or does the difference lie only in the organization of the training set and in the weight/bias initialization?

In case anyone wants to look at my code, here is the main loop that performs the training:

Err2 = N;
epochs = 0;
%compare MSE of error2
while ((Err2/N > 0.0003) && (u < 10000000) && (epochs < 100))
    epochs = epochs+1;
    Err = 0;
    %input->hidden weight vector
    wh = w(1:hidden_layer_len);
    %hidden->output weight vector
    wo = w((hidden_layer_len+1):(2*hidden_layer_len));
    %hidden bias
    bi = w((2*hidden_layer_len+1):(3*hidden_layer_len));
    %output bias
    bo = w(length(w));
    %start forward propagation
    for i=1:N
        %take next input value
        x = t(i);
        %propagate to hidden layer
        neth = x*wh + bi;
        %propagate through neurons
        ij = tanh(neth)';
        %propagate to output layer
        neto = ij*wo + bo;
        %propagate to output (purelin)
        output(i) = neto;
        %calculate difference from target (error)
        error(i) = yp(i) - output(i);

        %Backpropagation:

        %tanh derivative
        fhd = 1 - tanh(neth').*tanh(neth');
        %jacobian matrix
        J(i,:) = [-x*wo'.*fhd -ij -wo'.*fhd -1];

        %SSE (sum square error)
        Err = Err + 0.5*error(i)*error(i);
    end

    %calculate next error with updated weights and compare with old error

    %start error2 from error1 + 1 to enter while loop
    Err2 = Err+1;
    %while error2 is > than old error and Mu (u) is not too large
    while ((Err2 > Err) && (u < 10000000))
        %Weight update
        w2 = w - (((J'*J + u*eye(3*hidden_layer_len+1))^-1)*J')*error';
        %New Error calculation

        %New weights to propagate
        wh = w2(1:hidden_layer_len);
        wo = w2((hidden_layer_len+1):(2*hidden_layer_len));
        %new bias to propagate
        bi = w2((2*hidden_layer_len+1):(3*hidden_layer_len));
        bo = w2(length(w2));
        %calculate error2
        Err2 = 0;
        for i=1:N
            %forward propagation again
            x = t(i);
            neth = x*wh + bi;
            ij = tanh(neth)';
            neto = ij*wo + bo;
            output(i) = neto;
            error2(i) = yp(i) - output(i);

            %Error2 (SSE)
            Err2 = Err2 + 0.5*error2(i)*error2(i);
        end

        %compare MSE from error2 with a minimum
        %if greater still runing
        if (Err2/N > 0.0003)
            %compare with old error
            if (Err2 <= Err)
                %if less, update weights and decrease Mu (u)
                w = w2;
                u = u/10;
            else
                %if greater, increment Mu (u)
                u = u*10;
            end
        end
    end
end
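
For reference, the weight update applied in the inner loop above is the standard Levenberg-Marquardt step, restated here in my notation (J is the Jacobian accumulated in the forward loop, e is the error vector, and μ is the damping factor u):

\[
\mathbf{w}_{k+1} = \mathbf{w}_k - \left( J^{\top} J + \mu I \right)^{-1} J^{\top} \mathbf{e}
\]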

It's not easy to know the exact implementation of the Levenberg-Marquardt algorithm in MATLAB. You may try to run the algorithm one iteration at a time and check whether it is identical to your algorithm. You can also try other implementations, such as http://www.mathworks.com/matlabcentral/fileexchange/16063-lmfsolve-m--levenberg-marquardt-fletcher-algorithm-for-nonlinear-least-squares-problems , to see if the performance can be improved. For simple learning problems, convergence speed may be a matter of learning rate; you might simply increase the learning rate to get faster convergence.
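
One way to do the iteration-by-iteration comparison suggested above, sketched under the assumption that the network was built with feedforwardnet/trainlm and that t and yp are 1-by-N row vectors. The trainParam fields used are the documented trainlm parameters; note that mu is reset to trainParam.mu at every call to train, so this only approximates a single continuous run.

% Sketch: step toolbox LM training one epoch at a time and log the error
net = feedforwardnet(5, 'trainlm');
net = configure(net, t, yp);
net.divideFcn = 'dividetrain';        % use all samples for training (no val/test split)
net.trainParam.epochs = 1;            % one LM iteration per call to train
net.trainParam.mu     = 0.001;        % initial damping (trainlm default)
net.trainParam.mu_dec = 0.1;          % damping decrease factor on accepted steps
net.trainParam.mu_inc = 10;           % damping increase factor on rejected steps
net.trainParam.showWindow = false;    % no training GUI
for k = 1:20
    [net, tr] = train(net, t, yp);    % weights carry over between calls
    fprintf('epoch %d: mse = %g\n', k, tr.perf(end));
end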
