
Java XOR Neural Network not training properly

I have a neural network with 2 inputs, 2 hidden neurons and 1 output neuron to solve the XOR problem. I randomly initialise the weights between 0 and 1, and I use a learning rate of 0.1 with a sigmoid activation function.

When I train on only one input pair, for example 1 and 0 with a target of 1, it works fine and gives an appropriate guess. However, when I try to train on all the possible inputs together, the output converges to around 0.5-0.6.

I have tried changing the learning rate, the range in which the weights are randomly initialised and the number of times the network is trained, but none of this makes any difference to the final output.

Here is a link to my code on GitHub.

Any ideas on how I could fix this issue?

I suspect that the backpropagation isn't implemented properly. An overview is given in e.g. http://users.pja.edu.pl/~msyd/wyk-nai/multiLayerNN-en.pdf, in particular pages 17 to 20.

The tuneWeights and delta_weights methods of the Output_Neuron class are implemented properly. However, in this step the array weightedDeltaHidden (see the comment in the code) must also be determined, because it is needed later when the weights of the Hidden_Neuron class are tuned.

The tuneWeights and delta_weights methods of the Hidden_Neuron class don't seem to be implemented properly. Here, among other things, the previously determined array weightedDeltaHidden must be used.
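
The update rule that the two points above describe can be written out as follows. This is the usual delta rule for a sigmoid network, stated under the assumption that f.dSigmoid(output) returns y(1 - y) for an already activated output y. The symbols (t for the target, y for an output activation, h for a hidden activation, x for an input, w and v for the two weight layers, eta for the learning rate LR) are introduced here for illustration and do not appear in the original code.

```latex
\begin{align*}
\delta_j &= (t_j - y_j)\, y_j (1 - y_j)
  && \text{delta of output neuron } j \\
\Delta w_{ji} &= \eta\, \delta_j\, h_i
  && \text{weight from hidden neuron } i \text{ to output neuron } j \\
\delta_i^{\text{hidden}} &= h_i (1 - h_i) \sum_j \delta_j\, w_{ji}
  && \text{delta of hidden neuron } i \\
\Delta v_{ik} &= \eta\, \delta_i^{\text{hidden}}\, x_k
  && \text{weight from input } k \text{ to hidden neuron } i
\end{align*}
```

In these terms, each product delta_j * w_ji is one entry of weightedDeltaHidden, and the sum over j is the total that a hidden neuron needs before it can tune its own weights.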

In the code below I've made the necessary changes without essentially changing the design of the code. But maybe a refactoring would make sense.

Changes in the Output_Neuron class:

...

private double[] weightedDeltaHidden;

...

Output_Neuron(int hiddenNeurons) {

    ...

    this.weightedDeltaHidden = new double[hiddenNeurons];
}

...

void tuneWeights(double LR, double[] hidden_output, int target) {
    double delta = (target - output) * f.dSigmoid(output);
    for (int i = 0; i < weights.length; i++) {
        weights[i] += delta_weights(i, LR, delta, hidden_output);
    }
}

double delta_weights(int i, double LR, double delta, double[] hidden_output) {
    // weightedDeltaHidden[i] is the product of this output neuron's delta and its weight for the i-th hidden neuron.
    // That value is needed later, when the weights of the hidden neurons are tuned.
    weightedDeltaHidden[i] = delta * weights[i];
    return LR * delta * hidden_output[i];
}

...

double[] getWeightedDeltaHidden() {
    return weightedDeltaHidden;
}

Changes in the Hidden_Neuron class:

...

void tuneWeights(double LR, int[] inputs, double weightedDeltaHiddenTotal) {
    for (int i = 0; i < weights.length; i++) {
        weights[i] += delta_weights(LR, inputs[i], weightedDeltaHiddenTotal);
    }
}

private double delta_weights(double LR, double input, double weightedDeltaHiddenTotal) {
    double deltaOutput = f.dSigmoid(output) * weightedDeltaHiddenTotal;
    return LR * deltaOutput * input;
}

...

Changes in the Network class, inside the train method where the tuning of the hidden weights takes place:

void train(int[] inputs, int target) {

    ...

    //tune Hidden weights
    for (int i = 0; i < numOfHiddenNeurons; i++) {
        // weightedDeltaHiddenTotal is the sum of the weightedDeltaHidden values over all output neurons.
        // Each value is the product of the delta of the j-th output neuron and its weight for the i-th hidden neuron.
        double weightedDeltaHiddenTotal = 0;
        for (int j = 0; j < numOfOutputNeurons; j++) {
            weightedDeltaHiddenTotal += output_neurons[j].getWeightedDeltaHidden()[i];
        }
        hidden_neurons[i].tuneWeights(LR, inputs, weightedDeltaHiddenTotal);
    }
}
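
For readers who want to try the same update rule in isolation, here is a minimal, self-contained sketch of a 2-2-1 sigmoid network. It is not the code from the linked repository: the class name XorBackpropSketch, the flat weight arrays, the bias weights, the random sampling of training pairs and the seeds are all assumptions made for illustration.

```java
import java.util.Random;

/** Illustrative 2-2-1 sigmoid network; names and layout are hypothetical, not from the original repository. */
class XorBackpropSketch {
    // Per hidden neuron: two input weights plus a bias weight (the bias is an assumption).
    final double[][] wHidden = new double[2][3];
    // Output neuron: two hidden weights plus a bias weight.
    final double[] wOut = new double[3];
    final double[] hidden = new double[2];
    double output;

    XorBackpropSketch(long seed) {
        Random rnd = new Random(seed);
        // Weights are initialised uniformly in [0, 1), as described in the question.
        for (double[] row : wHidden)
            for (int k = 0; k < row.length; k++) row[k] = rnd.nextDouble();
        for (int k = 0; k < wOut.length; k++) wOut[k] = rnd.nextDouble();
    }

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    double feedForward(int[] in) {
        for (int i = 0; i < 2; i++)
            hidden[i] = sigmoid(wHidden[i][0] * in[0] + wHidden[i][1] * in[1] + wHidden[i][2]);
        output = sigmoid(wOut[0] * hidden[0] + wOut[1] * hidden[1] + wOut[2]);
        return output;
    }

    void train(int[] in, int target, double lr) {
        feedForward(in);
        // Delta of the single output neuron: error times sigmoid derivative y * (1 - y).
        double deltaOut = (target - output) * output * (1 - output);
        for (int i = 0; i < 2; i++) {
            // Hidden delta uses the output weight as it was before this step's update.
            double deltaHidden = deltaOut * wOut[i] * hidden[i] * (1 - hidden[i]);
            wHidden[i][0] += lr * deltaHidden * in[0];
            wHidden[i][1] += lr * deltaHidden * in[1];
            wHidden[i][2] += lr * deltaHidden;      // bias input is a constant 1
            wOut[i] += lr * deltaOut * hidden[i];
        }
        wOut[2] += lr * deltaOut;                   // output bias
    }

    public static void main(String[] args) {
        int[][] inputs = {{0, 0}, {1, 0}, {0, 1}, {1, 1}};
        int[] targets = {0, 1, 1, 0};
        XorBackpropSketch net = new XorBackpropSketch(42L);
        Random pick = new Random(7L);
        for (int cycle = 0; cycle < 1_000_000; cycle++) {
            int s = pick.nextInt(4);                // train on a randomly chosen pair
            net.train(inputs[s], targets[s], 0.1);
        }
        for (int[] in : inputs)
            System.out.printf("%d %d: %.4f%n", in[0], in[1], net.feedForward(in));
    }
}
```

With only one output neuron, the sum over output neurons collapses to the single product deltaOut * wOut[i], which plays the role of weightedDeltaHiddenTotal here. Whether a given run separates all four cases depends on the initial weights, so a different seed may behave differently.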

With those changes, a typical output for 1_000_000 train calls (2 hidden neurons) is:

Error: 1.9212e-01 in cycle 0
Error: 8.9284e-03 in cycle 100000
Error: 1.5049e-03 in cycle 200000
Error: 4.7214e-03 in cycle 300000
Error: 4.4727e-03 in cycle 400000
Error: 2.1179e-03 in cycle 500000
Error: 2.9165e-04 in cycle 600000
Error: 2.0655e-03 in cycle 700000
Error: 1.5381e-03 in cycle 800000
Error: 1.0440e-03 in cycle 900000
0 0: 0.0170
1 0: 0.9616
0 1: 0.9612
1 1: 0.0597

and for 100_000_000 train calls (2 hidden neurons):

Error: 2.4755e-01 in cycle 0
Error: 2.7771e-04 in cycle 5000000
Error: 6.8378e-06 in cycle 10000000
Error: 5.4317e-05 in cycle 15000000
Error: 6.8956e-05 in cycle 20000000
Error: 2.1072e-06 in cycle 25000000
Error: 2.6281e-05 in cycle 30000000
Error: 2.1630e-05 in cycle 35000000
Error: 1.1546e-06 in cycle 40000000
Error: 1.7690e-05 in cycle 45000000
Error: 8.6837e-07 in cycle 50000000
Error: 1.3603e-05 in cycle 55000000
Error: 1.2905e-05 in cycle 60000000
Error: 2.1657e-05 in cycle 65000000
Error: 1.1594e-05 in cycle 70000000
Error: 1.9191e-05 in cycle 75000000
Error: 1.7273e-05 in cycle 80000000
Error: 9.1364e-06 in cycle 85000000
Error: 1.5221e-05 in cycle 90000000
Error: 1.4501e-05 in cycle 95000000
0 0: 0.0008
1 0: 0.9961
0 1: 0.9961
1 1: 0.0053

Increasing the number of hidden neurons improves the performance. A typical output for 1_000_000 train calls (4 hidden neurons) is shown below:

Error: 1.2617e-02 in cycle 0
Error: 7.9950e-04 in cycle 100000
Error: 4.2567e-04 in cycle 200000
Error: 1.7279e-04 in cycle 300000
Error: 1.2246e-04 in cycle 400000
Error: 1.0456e-04 in cycle 500000
Error: 6.9140e-05 in cycle 600000
Error: 6.8698e-05 in cycle 700000
Error: 5.1640e-05 in cycle 800000
Error: 4.4534e-05 in cycle 900000
0 0: 0.0092
1 0: 0.9905
0 1: 0.9912
1 1: 0.0089
