
When should the back-propagation algorithm be called during neural network training?

I have a working back-propagation algorithm that correctly minimizes the error when it is iterated 100,000 times over the same single input, for example [ 1, 0 ] -> 1.

But I am not sure how to extend this to train the neural network when there are multiple inputs.

Suppose we wish to train the XOR function, with four possible input and output states:

[ 0, 0 ] -> 0

[ 0, 1 ] -> 1

[ 1, 0 ] -> 1

[ 1, 1 ] -> 0

I have tried calling the back-propagation algorithm after every single input-output training pair. The network doesn't learn at all in this fashion, even over a large number of iterations.

Should I instead compute the accumulated error over the entire training set (the 4 cases above) before calling back-propagation?

How should the accumulated errors be stored and used for the entire training set in this example?

Thank you.

Both variants, updating after every example and updating with the accumulated error, are correct. They simply implement two slightly different algorithms: updating after every sample gives you SGD (stochastic gradient descent), while updating once per pass over the whole set gives you GD (gradient descent). One can also do something in between, where you update after every batch of data (mini-batch). The issues you are describing (lack of learning) have nothing to do with when the update takes place. A minimal sketch of the two update schedules is given below.
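Here is a minimal sketch (my own illustration, not the poster's code) of the two schedules on a toy 1-D linear-fitting problem. The data, model, learning rate, and the `grad` helper are placeholders; only the structure of the two loops matters.

```python
import numpy as np

X = np.array([0.0, 1.0, 2.0, 3.0])   # toy inputs
T = np.array([1.0, 3.0, 5.0, 7.0])   # toy targets (t = 2x + 1)

def grad(w, b, x, t):
    """Gradient of the squared error 0.5*(w*x + b - t)^2 for one sample."""
    err = w * x + b - t
    return err * x, err

lr = 0.05

# --- SGD: update the parameters after every single sample ----------------
w, b = 0.0, 0.0
for epoch in range(1000):
    for x, t in zip(X, T):
        gw, gb = grad(w, b, x, t)
        w -= lr * gw
        b -= lr * gb

# --- Batch GD: accumulate gradients over the whole set, update once ------
w, b = 0.0, 0.0
for epoch in range(1000):
    gw_sum, gb_sum = 0.0, 0.0
    for x, t in zip(X, T):
        gw, gb = grad(w, b, x, t)
        gw_sum += gw
        gb_sum += gb
    w -= lr * gw_sum / len(X)   # normalize by the number of samples
    b -= lr * gb_sum / len(X)

print(w, b)  # both variants should approach w ≈ 2, b ≈ 1
```

Mini-batch updates simply replace the inner loop over single samples with a loop over small groups of samples, accumulating the gradient within each group before applying it.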

Note that "correctly learning" one sample does not mean you have a bug-free algorithm! A network in which you only adjust the bias of the final layer can fit a single sample, but it will fail for multiple samples. This is just one example of something that can be broken yet still pass your "single sample test".

If your model is a single-layer network, it will not be able to learn the XOR function, because XOR is not linearly separable. If it has more than one layer, you should accumulate the errors and normalize them by the total number of samples (in your case 4). Finally, the main cause of your problem might be a learning rate that is too high, which makes the parameters change too much at each step. Try reducing the learning rate and increasing the number of iterations; a small sketch follows. See https://medium.com/analytics-vidhya/understanding-basics-of-deep-learning-by-solving-xor-problem-cb3ff6a18a06 for reference.
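As an illustration, here is a minimal sketch (again my own, not the poster's code) of a small sigmoid network trained on XOR with full-batch gradient descent: the gradients of all 4 samples are accumulated, averaged, and applied once per iteration. The hidden-layer size, learning rate, initialization, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # 4 training inputs
T = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights and biases for a 2 -> 3 -> 1 network (small random init).
W1 = rng.normal(size=(2, 3)); b1 = np.zeros((1, 3))
W2 = rng.normal(size=(3, 1)); b2 = np.zeros((1, 1))

lr = 0.5  # modest learning rate; too large a rate can make training diverge
for it in range(20000):
    # Forward pass over the whole batch of 4 samples.
    H = sigmoid(X @ W1 + b1)          # hidden activations, shape (4, 3)
    Y = sigmoid(H @ W2 + b2)          # outputs, shape (4, 1)

    # Backward pass: accumulate gradients over all samples, then
    # normalize by the number of samples (4) via the mean.
    dY = (Y - T) * Y * (1 - Y)        # delta at the output (squared-error loss)
    dH = (dY @ W2.T) * H * (1 - H)    # delta at the hidden layer

    n = X.shape[0]
    W2 -= lr * (H.T @ dY) / n; b2 -= lr * dY.mean(axis=0, keepdims=True)
    W1 -= lr * (X.T @ dH) / n; b1 -= lr * dH.mean(axis=0, keepdims=True)

# Should approach [0, 1, 1, 0]; a different seed or more iterations may be
# needed if the network settles in a local minimum.
print(np.round(Y, 3))
```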

