
Clarification on back propagation

As I usually do when isolated at home for too long, I was thinking about back-propagation.

If my thought process is correct, then to compute the weight updates we never actually need to compute the cost itself. We only ever need to compute the derivative of the cost.

Is this correct?

I imagine that the only reason to compute the cost would be to check whether the network is actually learning.

I really believe I am correct, but when I check on the internet, no one seems to make this observation. So maybe I am wrong. If I am, I have a deep misunderstanding of backpropagation that I need to fix.

You are correct.

The cost function is what tells you how much the solution costs. The gradient is what carries the information about how to make it cost less.

You could shift the cost up or down by any constant and it wouldn't make a difference, because there is no way to make that part of the cost go down.
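A minimal sketch of that point, using a hypothetical one-weight model (y_hat = w * x with squared-error loss, chosen only for illustration): adding any constant to the loss changes neither the gradient nor the resulting weight update.

```python
# Hypothetical one-weight model y_hat = w * x with squared-error loss.
# Adding any constant c to the loss leaves the gradient, and therefore
# the weight update, untouched.
x, y = 2.0, 3.0      # one training example
w, lr = 0.5, 0.1     # weight and learning rate

grad = 2 * (w * x - y) * x   # d/dw [(w*x - y)**2 + c] for any constant c
w = w - lr * grad            # the update uses only the gradient, never the loss value

print(w)             # ~1.3
```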

Yes. Back-propagation (automatic differentiation) needs gradients, not the loss value. Once the forward pass is formulated, everything we need to formulate the gradients is available.

Another justification is that the back-propagation formula is just the chain rule, and the loss value itself never appears in it.
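As a minimal sketch of that chain rule (a hypothetical two-parameter network with squared-error loss, chosen only for illustration): the backward pass starts from the derivative of the loss and never touches the loss value itself.

```python
# Manual backward pass for y_hat = w2 * relu(w1 * x) with squared-error
# loss L = (y_hat - y)**2. Only derivatives flow backwards; L itself is
# never computed.
x, y = 1.5, 2.0
w1, w2 = 0.3, -0.7
lr = 0.01

# Forward pass (cache the intermediates needed by the chain rule).
h = w1 * x
a = max(h, 0.0)                 # ReLU
y_hat = w2 * a

# Backward pass: pure chain rule.
dL_dyhat = 2 * (y_hat - y)      # derivative of the loss, not the loss
dL_dw2 = dL_dyhat * a
dL_da = dL_dyhat * w2
dL_dh = dL_da * (1.0 if h > 0 else 0.0)
dL_dw1 = dL_dh * x

# Gradient-descent step, again using the gradients only.
w1 -= lr * dL_dw1
w2 -= lr * dL_dw2
print(w1, w2)
```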

I really believe I am correct, but when I check on the internet, no one seems to make this observation.

Indeed. NN articles and textbooks always talk about the loss, but they rarely make it clear that all we need for back-propagation are the gradients from the chain rule, with which we can do gradient descent.
