I am trying to read the following code for backpropagation in Python:
probs = exp_scores /np.sum(exp_scores, axis=1, keepdims=True)
#Backpropagation
delta3 = probs
delta3[range(num_examples), y] -= 1
dW2 = (a1.T).dot(delta3)
....
But I cannot understand the following line of the code:
delta3[range(num_examples), y] -= 1
Could you please tell me what this does?
Thank you very much for your help!
There are two things here. First, it uses NumPy integer-array indexing (sometimes called fancy indexing, which is not the same as plain slicing) to select a subset of the elements of delta3. Second, it subtracts 1 from each of those selected elements.
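More precisely, delta3[range(num_examples), y] pairs up the two index lists: for each row i from 0 to num_examples - 1, it selects the single element in column y[i]. So exactly one element per row is selected (the one in the column of that example's true class), not an entire column.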
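A minimal sketch of what that indexing does, using made-up values (the array contents, the labels in y, and the batch size of 3 are assumptions for illustration, not taken from the question):

```python
import numpy as np

# Hypothetical softmax output: 3 examples, 4 classes (values are made up).
probs = np.array([
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
])
y = np.array([3, 0, 1])            # assumed true class of each example
num_examples = probs.shape[0]

delta3 = probs.copy()              # copy so that probs itself is left untouched

# Integer-array indexing pairs rows with columns: (0, 3), (1, 0), (2, 1).
picked = delta3[range(num_examples), y]

# Subtract 1 at exactly those three positions, one per row.
delta3[range(num_examples), y] -= 1

print(picked)
print(delta3)
```

Note that the code in the question writes delta3 = probs without a copy, so there both names refer to the same array and the in-place subtraction changes probs as well. That is harmless only if probs is not used again afterwards.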
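If you're interested in why it's computed this way: it's backpropagation through the cross-entropy loss. probs is the matrix of class probabilities, one row per example (computed in the forward pass via softmax). delta3 is the error signal from the loss function. y holds the ground-truth classes for the mini-batch. For softmax followed by cross-entropy, the gradient of the loss with respect to the pre-softmax scores is probs minus the one-hot encoding of y, which is exactly what subtracting 1 at the true-class positions produces. Everything else is just math, which is well explained in this post; they end up with the same numpy expression.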
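One way to convince yourself without going through the math is a numerical gradient check. The sketch below is not from the original code: the helper names softmax and loss, the random scores, and the labels are all hypothetical, and it assumes the loss is the cross-entropy summed over the batch (the snippet in the question does not divide by num_examples).

```python
import numpy as np

def softmax(scores):
    # Subtracting the row max improves numerical stability; the result is unchanged.
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

def loss(scores, y):
    # Assumed loss: cross-entropy summed (not averaged) over the batch.
    probs = softmax(scores)
    return -np.sum(np.log(probs[range(len(y)), y]))

rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 3))   # hypothetical pre-softmax scores
y = np.array([0, 2, 1, 1, 0])      # hypothetical true classes

# Analytic gradient, computed the same way as in the question.
delta3 = softmax(scores)
delta3[range(len(y)), y] -= 1

# Central-difference estimate of d(loss)/d(scores), one entry at a time.
eps = 1e-6
numeric = np.zeros_like(scores)
for i in range(scores.shape[0]):
    for j in range(scores.shape[1]):
        bump = np.zeros_like(scores)
        bump[i, j] = eps
        numeric[i, j] = (loss(scores + bump, y) - loss(scores - bump, y)) / (2 * eps)

max_diff = np.max(np.abs(numeric - delta3))
print(max_diff)   # should be tiny if the two gradients agree
```

If the loss were averaged over the batch instead, the analytic gradient would additionally need to be divided by the number of examples.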