
Deep learning: gradient at the last layer before the output is always zero

I have been working on the Udacity self-driving challenge #2. Whatever changes I make to the deep network (learning rate, activation function, etc.), I keep running into a zero-gradient issue while training. I have tried both cross-entropy loss and MSE loss. For cross-entropy, 100 classes are used, each spanning about 10 degrees, i.e. 0.17 radians. For example, the range (-8.2 to -8.03) is class 0, (-8.03 to -7.86) is class 1, and so on.
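A minimal sketch of this binning for illustration (the 0.17 rad bin width and the -8.2 rad lower bound come from the example above; the helper name and the clipping to the class range are just assumptions, not my exact code):

```python
import numpy as np

NUM_CLASSES = 100
BIN_WIDTH = 0.17    # ~10 degrees expressed in radians
ANGLE_MIN = -8.2    # class 0 covers (-8.2, -8.03)

def angle_to_class(angle_rad):
    """Map a steering angle in radians to one of the 100 bins."""
    idx = int((angle_rad - ANGLE_MIN) // BIN_WIDTH)
    return int(np.clip(idx, 0, NUM_CLASSES - 1))

print(angle_to_class(-8.1))   # -> 0  (falls in -8.2 .. -8.03)
print(angle_to_class(-7.9))   # -> 1  (falls in -8.03 .. -7.86)
```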

Please find the attached screenshots. As you can see, the gradient of the layer before the output (fc4 in the first image) almost becomes zero, and most of the gradients in the layers above follow the same pattern. I need some suggestions to eliminate this zero-gradient problem.

Model view

Gradient_Zero_fc4_layer
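For reference, this is roughly how I inspect the per-layer gradient norms during training (a hedged sketch assuming a PyTorch model and a standard training loop, not my exact code):

```python
import torch

def log_gradient_norms(model):
    """Print the L2 norm of each parameter's gradient after backward()."""
    for name, param in model.named_parameters():
        if param.grad is not None:
            print(f"{name}: grad norm = {param.grad.norm().item():.6f}")

# inside the training loop:
#   loss.backward()
#   log_gradient_norms(model)   # fc4 showing ~0 here is the symptom above
#   optimizer.step()
```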

This seems to be the vanishing gradient problem (see the sketch below the list).

1.) Have you tried ReLU? (I know you said you have tried different activation functions.)
2.) Have you tried reducing the number of layers?
3.) Are your features normalized?
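A minimal sketch of what I mean, assuming PyTorch; the layer sizes and the input dimension are placeholders, not the asker's actual architecture:

```python
import torch
import torch.nn as nn

# ReLU instead of a saturating activation, fewer layers, normalized inputs.
model = nn.Sequential(
    nn.Linear(128, 256),   # fc1 (placeholder input size of 128 features)
    nn.ReLU(),
    nn.Linear(256, 64),    # fc2 (a shallower stack overall)
    nn.ReLU(),
    nn.Linear(64, 100),    # output logits over the 100 angle classes
)

def normalize(x):
    # zero-mean / unit-variance scaling of the input features
    return (x - x.mean()) / (x.std() + 1e-8)

x = torch.randn(32, 128)                      # dummy batch of 32 samples
logits = model(normalize(x))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 100, (32,)))
loss.backward()                               # gradients should no longer vanish
```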

There are architectures designed to prevent this as well (e.g., LSTM), but I think you should be able to get by with something simple like the above.
