Gradient descent function in Python - error in loss function or weights
I am practicing with a gradient descent function, but I still cannot get the expected result. Specifically, I get two error messages:

Wrong output of the loss function. Check how you implemented the matrix multiplication.
Wrong values of the weight matrix theta. Check how you update the weight matrix.

When running the function (see below), I can see the cost decrease on every iteration, but it still does not converge to the result expected by the exercise. I have tried several modifications to the formulas, but have not solved it yet.
# gradientDescent
def gradientDescent(x, y, theta, alpha, num_iters):
    '''
    Input:
        x: matrix of features which is (m,n+1)
        y: corresponding labels of the input matrix x, dimensions (m,1)
        theta: weight vector of dimension (n+1,1)
        alpha: learning rate
        num_iters: number of iterations you want to train your model for
    Output:
        J: the final cost
        theta: your final weight vector
    Hint: you might want to print the cost to make sure that it is going down.
    '''
    ### START CODE HERE ###
    # get 'm', the number of rows in matrix x
    m = len(x)
    for i in range(0, num_iters):
        # get z, the dot product of x and theta
        # z = predictions
        z = np.dot(x, theta)
        h = sigmoid(z)
        loss = z - y
        # calculate the cost function
        J = (-1/m) * np.sum(loss)
        print("Iteration %d | Cost: %f" % (i, J))
        gradient = np.dot(x.T, loss)
        # update theta
        theta = theta - (1/m) * alpha * gradient
    ### END CODE HERE ###
    J = float(J)
    return J, theta
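The snippet above calls sigmoid, which is not shown; for reference, a minimal implementation (assumed here to be the standard logistic function) is:

```python
import numpy as np

def sigmoid(z):
    # Elementwise logistic function: maps any real input into (0, 1)
    return 1 / (1 + np.exp(-z))
```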
The problem was that I applied both the cost-function formula and the weight-update formula incorrectly. The correct formulas are:

J = −(1/m) · (yᵀ · log(h) + (1 − y)ᵀ · log(1 − h))

θ = θ − (α/m) · (xᵀ · (h − y))
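As a quick sanity check on the vectorized cost formula, the following toy example (made-up numbers, not from the exercise) compares it against an elementwise computation of the same cross-entropy:

```python
import numpy as np

# Hypothetical toy data: labels y and predictions h in (0, 1), both shape (3, 1)
y = np.array([[1.0], [0.0], [1.0]])
h = np.array([[0.9], [0.2], [0.7]])
m = y.shape[0]

# Vectorized cross-entropy, as in the formula above
J_vec = float((-1/m) * (np.dot(y.T, np.log(h)) + np.dot((1-y).T, np.log(1-h))))

# Elementwise equivalent for comparison
J_loop = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
```

Both expressions compute the same mean negative log-likelihood, so they should agree to floating-point precision.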
The solution was:

J = (-1/m) * (np.dot(y.T, np.log(h)) + np.dot((1-y).T, np.log(1-h)))
theta = theta - (alpha/m) * gradient

Note that, to match the xᵀ · (h − y) term of the update formula, the loss fed into the gradient must also be h - y rather than z - y.
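Putting the corrected pieces together, a minimal self-contained version of the function looks like this (a sketch; sigmoid is assumed to be the standard logistic function, and the per-iteration print is omitted):

```python
import numpy as np

def sigmoid(z):
    # Standard logistic function
    return 1 / (1 + np.exp(-z))

def gradientDescent(x, y, theta, alpha, num_iters):
    m = x.shape[0]
    for i in range(num_iters):
        h = sigmoid(np.dot(x, theta))           # predictions, shape (m, 1)
        # cross-entropy cost: J = -(1/m)(y^T log(h) + (1-y)^T log(1-h))
        J = (-1/m) * (np.dot(y.T, np.log(h))
                      + np.dot((1-y).T, np.log(1-h)))
        gradient = np.dot(x.T, h - y)           # note h - y, not z - y
        theta = theta - (alpha/m) * gradient
    return float(J), theta
```

With this version the cost keeps decreasing and the weights converge, e.g. on a small synthetic dataset running more iterations yields a strictly lower final cost than running one.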