
Neural network online training

I want to implement a simple feed-forward neural network to approximate the function y=f(x)=ax^2, where a is some constant and x is the input value.

The NN has one input node, one hidden layer with 1-n nodes, and one output node. For example, I input the value 2.0 -> the NN produces 4.0; I input 3.0 -> the NN produces 9.0 or close to it, and so on.

If I understand "online training" correctly, the training data is fed one sample at a time - meaning I input the value 2.0 -> I iterate with gradient descent 100 times, and then I pass the value 3.0 and iterate another 100 times.
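In code, the loop I am describing is roughly this (a minimal sketch; `step_gradient` is just a placeholder for my actual weight update):

```python
def step_gradient(x, y):
    """Placeholder for one gradient-descent update of my network's weights."""
    ...

samples = [(2.0, 4.0), (3.0, 9.0)]   # (x, a*x^2) pairs, here with a = 1

for x, y in samples:                 # feed the training data one by one
    for _ in range(100):             # 100 gradient-descent iterations per sample
        step_gradient(x, y)
```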

However, when I try to do this with my experimental/learning NN - I input the value 2.0 -> the error gets very small -> the output is very close to 4.0.

Now if I ask the NN to predict for the input 3.0 -> it produces 4.36 or something instead of 9.0. So the NN just learns the last training value.

How can I use online-training to get a Neural Network that approximates the desired function for a range [-d, d]? What am I missing?

The reason why I like online-training is that eventually I want to input a time series - and map that series to the desired function. This is beside the point, but in case someone was wondering.

Any advice would be greatly appreciated.

More info - I am activating the hidden layer with the sigmoid function and the output layer with a linear activation.
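Concretely, the forward pass looks like this (a sketch with my own variable names; `n` is the number of hidden nodes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One scalar input, n sigmoid hidden units, one linear output node."""
    h = sigmoid(W1 * x + b1)     # hidden activations, shape (n,)
    return float(W2 @ h + b2)    # scalar output, no output nonlinearity
```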

> The reason why I like online-training is that eventually I want to input a time series - and map that series to the desired function.

Recurrent Neural Networks (RNNs) are the state of the art for modeling time series. This is because they can take inputs of arbitrary length, and they can also use internal state to model the changing behavior of the series over time.
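To make "internal state" concrete, a vanilla RNN step looks roughly like this (a minimal sketch, not any particular library's API; the names and shapes are my own):

```python
import numpy as np

def rnn_encode(xs, Wx, Wh, b):
    """Fold an arbitrary-length sequence into a fixed-size state vector.

    The state h is fed back in at every step, which is what lets the
    network's output depend on everything it has seen so far.
    """
    h = np.zeros(Wh.shape[0])
    for x in xs:                          # xs can be any length
        h = np.tanh(Wx * x + Wh @ h + b)  # new state from input and old state
    return h                              # summary of the whole series
```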

Training feedforward neural networks for time series is an older method which will generally not perform as well. They require a fixed-size input, so you must choose a fixed-size sliding time window, and they also don't preserve state, so it is hard for them to learn a time-varying function.
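For example, the sliding-window approach turns the series into ordinary fixed-size supervised pairs, roughly like this (a sketch; the window length is a choice you have to make):

```python
import numpy as np

def sliding_windows(series, window):
    """Turn a 1-D series into (previous `window` values, next value) pairs."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

# sliding_windows([1, 2, 3, 4, 5], 2)
# -> X = [[1, 2], [2, 3], [3, 4]],  y = [3, 4, 5]
```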

I can find very little about "online training" of feedforward neural nets with stochastic gradient descent to model non-stationary behavior, except for a couple of very vague references. I don't think it provides any benefit beyond allowing you to train in real time when you are receiving a stream of data one point at a time. I don't think it will actually help you model time-dependent behavior.

Most of the older methods I can find in the literature about online learning for neural networks use a hybrid approach with a neural network and some other method that can help capture time dependencies. Again, these should all be inferior to RNNs, not to mention harder to implement in practice.

Furthermore, I don't think you are implementing online training correctly. It should be stochastic gradient descent with a mini-batch size of 1. Therefore, you only run one iteration of gradient descent on each training example per training epoch. Since you are running 100 iterations before moving on to the next training example, you are going too far down the error gradient with respect to that single example, resulting in serious overfitting to a single data point. This is why you get poor results on the next input. I don't think this is a justifiable method of training, nor do I think it will work for time series.
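For contrast, here is a minimal sketch of online training done this way on y = a*x^2, using the sigmoid-hidden/linear-output architecture you describe. I am assuming a squared-error loss, and every hyperparameter below (hidden size, learning rate, epoch count, range d) is an arbitrary choice for illustration, not a recommendation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Learn y = a*x^2 on [-d, d]; a and d are arbitrary here.
a, d = 1.0, 3.0
n_hidden, lr = 16, 0.01

xs = rng.uniform(-d, d, size=200)    # training inputs covering the whole range
ys = a * xs ** 2                     # targets

# One input node, n_hidden sigmoid units, one linear output node.
W1 = rng.normal(0.0, 1.0, n_hidden)
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 1.0, n_hidden)
b2 = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):                     # training epochs
    for i in rng.permutation(len(xs)):    # shuffle the examples each epoch...
        x, y = xs[i], ys[i]
        h = sigmoid(W1 * x + b1)          # forward pass
        y_hat = W2 @ h + b2
        err = y_hat - y                   # d/dy_hat of (y_hat - y)^2 / 2
        grad_h = err * W2 * h * (1.0 - h) # backprop through the sigmoid layer
        W2 -= lr * err * h                # ...and take ONE update per example
        b2 -= lr * err
        W1 -= lr * grad_h * x
        b1 -= lr * grad_h

for x in (2.0, 3.0):                      # both queries, not just the last one
    h = sigmoid(W1 * x + b1)
    print(x, "->", float(W2 @ h + b2))    # should be close to a*x^2
```

The two differences from your loop are the shuffling and the single update per example; how good the final fit is still depends on the hidden size, learning rate, and number of epochs.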

You haven't mentioned what your loss function is, so I can't comment on whether it is appropriate for the task.

Also, I don't think learning y=ax^2 is a good analogy for time series prediction. It is a static function that always gives the same output for a given input, regardless of the input's position in the series or the values of previous inputs.
