After creating the following neural network:
nn = new BasicNetwork();
nn.addLayer(new BasicLayer(null, true, 29));                    // input layer: 29 neurons, with bias
nn.addLayer(new BasicLayer(new ActivationReLU(), true, 1000));  // hidden layer 1
nn.addLayer(new BasicLayer(new ActivationReLU(), true, 100));   // hidden layer 2
nn.addLayer(new BasicLayer(new ActivationReLU(), true, 100));   // hidden layer 3
nn.addLayer(new BasicLayer(new ActivationTANH(), false, 4));    // output layer: 4 neurons, tanh
nn.getStructure().finalizeStructure();
nn.reset();                                                     // randomize weights
I got an error greater than 10^38, which is completely insane. So I implemented the error function myself and found that the error really was that large. I first checked my ideal outputs and saw that they were all in the range -1 to 1. The computed outputs, however, were far larger than 1. From this I conclude it is a floating-point error (overflow).
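For reference, a minimal sketch of such a manual error check (illustrative only; it assumes the data is held in an Encog MLDataSet named trainingSet):

import org.encog.ml.data.MLData;
import org.encog.ml.data.MLDataPair;
import org.encog.ml.data.MLDataSet;

// Compute the mean squared error by hand and flag any network
// output that falls outside the expected [-1, 1] range.
double sum = 0;
int count = 0;
for (MLDataPair pair : trainingSet) {
    MLData output = nn.compute(pair.getInput());
    for (int i = 0; i < output.size(); i++) {
        double diff = pair.getIdeal().getData(i) - output.getData(i);
        sum += diff * diff;
        count++;
        if (Math.abs(output.getData(i)) > 1.0) {
            System.out.println("Output outside [-1, 1]: " + output.getData(i));
        }
    }
}
System.out.println("MSE: " + (sum / count));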
Is my conclusion correct? And what can I do to avoid such time-consuming mistakes next time?
Sincerely
Edit:
nn = new BasicNetwork();
nn.addLayer(new BasicLayer(null, true, 29));
nn.addLayer(new BasicLayer(new ActivationSigmoid(), true, 1000)); // ReLU replaced by sigmoid
nn.addLayer(new BasicLayer(new ActivationSigmoid(), true, 100));
nn.addLayer(new BasicLayer(new ActivationSigmoid(), true, 100));
nn.addLayer(new BasicLayer(new ActivationTANH(), false, 4));
nn.getStructure().finalizeStructure();
nn.reset();
The problem still occurs after switching to sigmoid activation functions. How can I fix this?
- Use a much smaller learning rate, such as 0.0001 or even smaller (see the sketch after this list).
- Randomly initialize the weights.
- Initialize the biases to 1.
- Try batch normalization.
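A minimal sketch of the first two points in Encog (Backpropagation takes the learning rate and momentum in its constructor; trainingSet is assumed to be your MLDataSet):

import org.encog.ml.data.MLDataSet;
import org.encog.neural.networks.training.propagation.back.Backpropagation;

// Plain backpropagation with a very small learning rate (0.0001)
// and a moderate momentum (0.9). nn.reset() above already handles
// the random weight initialization.
Backpropagation train = new Backpropagation(nn, trainingSet, 0.0001, 0.9);
do {
    train.iteration();
    System.out.println("Epoch error: " + train.getError());
} while (train.getError() > 0.01);
train.finishTraining();

As far as I know, Encog has no built-in batch-normalization layer, so that last point would require a custom implementation or a different framework.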
The ReLU function cannot squash values: for positive inputs it is simply y = x, so activations are unbounded. With a large learning rate the gradients keep growing, and the values get larger and larger (exploding gradients).
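To see the difference, compare ReLU and tanh on a large input (plain Java, illustrative only):

// ReLU passes large positive values through unchanged,
// while tanh squashes every input into (-1, 1).
double x = 50.0;
System.out.println(Math.max(0.0, x)); // ReLU(50) -> 50.0
System.out.println(Math.tanh(x));     // tanh(50) -> ~1.0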