
Backpropagation not working: Neural Network Java

I have created a simple neural network with 3 layers, following this Python example: Link (note: you have to scroll down until you reach Part 2).

This is my Java implementation of the code:

private void trainNet()
{
    // INPUT is a 4*3 matrix
    // SYNAPSES is a 3*4 matrix
    // SYNAPSES2 is a 4*1 matrix
    // 4*3 matrix DOT 3*4 matrix => 4*4 matrix: unrefined test results
    double[][] layer1 = sigmoid(dot(inputs, synapses), false);

    // 4*4 matrix DOT 4*1 matrix => 4*1 matrix: 4 final test results
    double[][] layer2 = sigmoid(dot(layer1, synapses2), false);

    // 4*1 matrix - 4*1 matrix => 4*1 matrix: error of 4 test results
    double[][] layer2Error = subtract(outputs, layer2);

    // 4*1 matrix DOT 4*1 matrix => 4*1 matrix: percentage of change of 4 test results
    double[][] layer2Delta = dot(layer2Error, sigmoid(layer2, true));

    // 4*1 matrix DOT 3*1 matrix => 4*1 matrix
    double[][] layer1Error = dot(layer2Delta, synapses2);

    // 4*1 matrix DOT 4*4 matrix => 4*4 matrix: percentage of change of 4 test results
    double[][] layer1Delta = dot(layer1Error, sigmoid(layer1, true));

    double[][] transposedInputs = transpose(inputs);
    double[][] transposedLayer1 = transpose(layer1);

    //  4*4 matrix DOT 4*1 matrix => 4*1 matrix: the updated weights
    // Update the weights
    synapses2 = sum(synapses2, dot(transposedLayer1, layer2Delta));

    // 3*4 matrix DOT 4*4 matrix => 3*4 matrix: the updated weights
    // Update the weights
    synapses = sum(synapses, dot(transposedInputs, layer1Delta));

    // Test each value of two 4*1 matrices with each other
    testValue(layer2, outputs);
}

I wrote the dot, sum, subtract and transpose functions myself, and I'm fairly sure they do their job correctly.
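One detail worth checking: in the NumPy tutorial being ported, `*` between arrays is an element-wise (Hadamard) product, while `numpy.dot` is a true matrix product, and the delta lines (`layer2Delta`, `layer1Delta`) use the element-wise form. If a hand-written `dot` helper always performs a matrix product, the deltas come out wrong even though every helper is individually "correct". The sketch below contrasts the two operations; `matMul` and `hadamard` are illustrative names, not the helpers from the question.

```java
// Minimal sketch contrasting a matrix product with the element-wise
// (Hadamard) product that backpropagation deltas require.
// matMul/hadamard are illustrative names, not the asker's helpers.
public class MatrixOps {
    // Standard matrix product: (n*m) x (m*p) -> (n*p)
    static double[][] matMul(double[][] a, double[][] b) {
        int n = a.length, m = b.length, p = b[0].length;
        double[][] c = new double[n][p];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < m; k++)
                for (int j = 0; j < p; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    // Element-wise product of two same-shaped matrices (NumPy's `*`)
    static double[][] hadamard(double[][] a, double[][] b) {
        double[][] c = new double[a.length][a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                c[i][j] = a[i][j] * b[i][j];
        return c;
    }

    public static void main(String[] args) {
        double[][] error = {{0.5}, {0.25}};  // 2*1 "error" column
        double[][] deriv = {{0.2}, {0.4}};   // 2*1 sigmoid derivative
        double[][] delta = hadamard(error, deriv);
        System.out.println(delta[0][0] + " " + delta[1][0]); // prints 0.1 0.1
    }
}
```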

The first batch of inputs gives me an error of about 0.4, which is alright, because the weights have random values. On the second run the error margin is smaller, but only by a very tiny amount (0.001).

After 500,000 batches (so 2,000,000 tests in total) the network still hasn't produced a single correct value! So I tried an even larger number of batches. Using 1,000,000 batches (so 4,000,000 tests in total), the network generates a whopping 16,900 correct results.

Could anyone please tell me what's going on?

These were the used weights:

First layer:

  • 2.038829298171684 2.816232761170282 1.6740269469812146 1.634422766238497
  • 1.5890997594993828 1.7909325329112222 2.101840236824494 1.063579126586681
  • 3.761238407071311 3.757148454039234 3.7557450538398176 3.6715972104291605

Second layer:

  • -0.019603811941904248
  • 218.38253323323553
  • 53.70133275445734
  • -272.83589796861514

EDIT: Thanks to lsnare for pointing out that using a library would be way easier!

For those interested, here is the working code using the JAMA library (math.nist.gov/javanumerics):

private void trainNet()
{
    // INPUT is a 4*3 matrix
    // SYNAPSES is a 3*4 matrix
    // SYNAPSES2 is a 4*1 matrix
    // 4*3 matrix DOT 3*4 matrix => 4*4 matrix: unrefined test results
    Matrix hiddenLayer = sigmoid(inputs.times(synapses), false);

    // 4*4 matrix DOT 4*1 matrix => 4*1 matrix: 4 final test results
    Matrix outputLayer = sigmoid(hiddenLayer.times(synapses2), false);

    // 4*1 matrix - 4*1 matrix => 4*1 matrix: error of 4 test results
    Matrix outputLayerError = outputs.minus(outputLayer);

    // 4*1 matrix DOT 4*1 matrix => 4*1 matrix: percentage of change of 4 test results
    Matrix outputLayerDelta = outputLayerError.arrayTimes(sigmoid(outputLayer, true));

    // 4*1 matrix DOT 1*4 matrix => 4*4 matrix
    Matrix hiddenLayerError = outputLayerDelta.times(synapses2.transpose());

    // 4*4 matrix DOT 4*4 matrix => 4*4 matrix: percentage of change of 4 test results
    Matrix hiddenLayerDelta = hiddenLayerError.arrayTimes(sigmoid(hiddenLayer, true));

    //  4*4 matrix DOT 4*1 matrix => 4*1 matrix: the updated weights
    // Update the weights
    synapses2 = synapses2.plus(hiddenLayer.transpose().times(outputLayerDelta));

    // 3*4 matrix DOT 4*4 matrix => 3*4 matrix: the updated weights
    // Update the weights
    synapses = synapses.plus(inputs.transpose().times(hiddenLayerDelta));

    // Test each value of two 4*1 matrices with each other
    testValue(outputLayer.getArrayCopy(), outputs.getArrayCopy());
}
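The `sigmoid(matrix, derivative)` helper is never shown in either version. Assuming it follows the referenced Python tutorial, the derivative flag means the input is taken to already be sigmoid output, so the derivative is computed as x*(1-x). A hedged sketch of such a helper, using plain `double[][]` as in the original code:

```java
// Hedged sketch of what the (unshown) sigmoid helper presumably does,
// following the referenced Python tutorial: with deriv == true the input
// is assumed to already be sigmoid output, so the derivative is x*(1-x).
public class Sigmoid {
    static double[][] sigmoid(double[][] x, boolean deriv) {
        double[][] out = new double[x.length][x[0].length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[0].length; j++) {
                out[i][j] = deriv
                        ? x[i][j] * (1.0 - x[i][j])          // x already sigmoided
                        : 1.0 / (1.0 + Math.exp(-x[i][j]));  // logistic function
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] z = {{0.0}};
        System.out.println(sigmoid(z, false)[0][0]);               // prints 0.5
        System.out.println(sigmoid(sigmoid(z, false), true)[0][0]); // prints 0.25
    }
}
```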

In general, when writing code that involves advanced mathematical or numerical computation (such as linear algebra) it's best to use existing libraries written by experts in the field, rather than write your own functions. Standard libraries will produce more accurate results and are most likely more efficient. For example, in the blog that you reference, the author uses the numpy library to compute dot products and transposition of matrices. For Java, you could use the Java Matrix Package (JAMA) that was developed by NIST: http://math.nist.gov/javanumerics/jama/
For example, to transpose a matrix:

double[][] in = {{0,0,1},{0,1,1},{1,0,1},{1,1,1}};
Matrix input = new Matrix(in);
input = input.transpose();

I'm not sure if this will solve your issue completely, but hopefully it will save you from writing extra code in the future.
