
Deep learning: gradient at the last layer before the output is always zero

I have been working on the Udacity self-driving challenge #2. Whatever changes I make to the deep network (learning rate, activation function, etc.), I keep running into a zero-gradient issue while training. I have tried both cross-entropy loss and MSE loss. For cross-entropy, 100 classes are used, each spanning about 10 degrees, i.e. 0.17 radians. For example, the range (-8.2 to -8.03) is class 0, (-8.03 to -7.86) is class 1, and so on.
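A minimal sketch of this binning for illustration (the 0.17 rad bin width and the -8.2 rad lower bound come from the example above; the helper name and the clipping to the class range are just assumptions, not my exact code):

```python
import numpy as np

NUM_CLASSES = 100
BIN_WIDTH = 0.17    # ~10 degrees expressed in radians
ANGLE_MIN = -8.2    # class 0 covers (-8.2, -8.03)

def angle_to_class(angle_rad):
    """Map a steering angle in radians to one of the 100 bins."""
    idx = int((angle_rad - ANGLE_MIN) // BIN_WIDTH)
    return int(np.clip(idx, 0, NUM_CLASSES - 1))

print(angle_to_class(-8.1))   # -> 0  (falls in -8.2 .. -8.03)
print(angle_to_class(-7.9))   # -> 1  (falls in -8.03 .. -7.86)
```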

Please find the attached screenshots. As you can see, the gradient of the layer before the output (fc4 in the first image) almost becomes zero, and most of the gradients in the layers above follow the same pattern. I need some suggestions to eliminate this zero-gradient problem.

Model view

Gradient_Zero_fc4_layer
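For reference, this is roughly how I inspect the per-layer gradient norms during training (a hedged sketch assuming a PyTorch model and a standard training loop, not my exact code):

```python
import torch

def log_gradient_norms(model):
    """Print the L2 norm of each parameter's gradient after backward()."""
    for name, param in model.named_parameters():
        if param.grad is not None:
            print(f"{name}: grad norm = {param.grad.norm().item():.6f}")

# inside the training loop:
#   loss.backward()
#   log_gradient_norms(model)   # fc4 showing ~0 here is the symptom above
#   optimizer.step()
```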

This seems to be the vanishing gradient problem (see the sketch below the list).

1.) Have you tried ReLU? (I know you said you have tried different activation functions.)
2.) Have you tried reducing the number of layers?
3.) Are your features normalized?
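A minimal sketch of what I mean, assuming PyTorch; the layer sizes and the input dimension are placeholders, not the asker's actual architecture:

```python
import torch
import torch.nn as nn

# ReLU instead of a saturating activation, fewer layers, normalized inputs.
model = nn.Sequential(
    nn.Linear(128, 256),   # fc1 (placeholder input size of 128 features)
    nn.ReLU(),
    nn.Linear(256, 64),    # fc2 (a shallower stack overall)
    nn.ReLU(),
    nn.Linear(64, 100),    # output logits over the 100 angle classes
)

def normalize(x):
    # zero-mean / unit-variance scaling of the input features
    return (x - x.mean()) / (x.std() + 1e-8)

x = torch.randn(32, 128)                      # dummy batch of 32 samples
logits = model(normalize(x))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 100, (32,)))
loss.backward()                               # gradients should no longer vanish
```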

There are architectures designed to prevent this as well (e.g., LSTM), but I think you should be able to get by with something simple like the above.
