
Gradient descent function in python - error in loss function or weights

I'm working with the gradient function for an exercise, but I still can't get the expected outcome. That is, I receive two error messages:

  1. Wrong output for the loss function. Check how you are implementing the matrix multiplications.

  2. Wrong values for weight's matrix theta. Check how you are updating the matrix of weights.

When applying the function (see below), I notice that the cost decreases at each iteration, but it still does not converge to the desired outcome in the exercise. I have already tried several adaptations of the formula but couldn't solve it yet.

# gradientDescent

def gradientDescent(x, y, theta, alpha, num_iters):
    '''
    Input:
        x: matrix of features which is (m,n+1)
        y: corresponding labels of the input matrix x, dimensions (m,1)
        theta: weight vector of dimension (n+1,1)
        alpha: learning rate
        num_iters: number of iterations you want to train your model for
    Output:
        J: the final cost
        theta: your final weight vector
    Hint: you might want to print the cost to make sure that it is going down.
    '''
    ### START CODE HERE ###
    # get 'm', the number of rows in matrix x
    m = len(x)

    for i in range(0, num_iters):

        # get z, the dot product of x and theta
        # z = predictions
        z = np.dot(x, theta)
        h = sigmoid(z)
        loss = z - y

        # calculate the cost function
        J = (-1/m) * np.sum(loss)
        print("Iteration %d | Cost: %f" % (i, J))

        gradient = np.dot(x.T, loss)

        # update theta
        theta = theta - (1/m) * alpha * gradient

    ### END CODE HERE ###
    J = float(J)
    return J, theta

The issue is that I wrongly applied the formula of the cost function and the formula for updating the weights:

𝐽 = −1/𝑚 × (𝐲ᵀ⋅log(𝐡) + (1−𝐲)ᵀ⋅log(1−𝐡))

𝜃 = 𝜃 − 𝛼/𝑚 × (𝐱ᵀ⋅(𝐡−𝐲))

The solution is:

 J = (-1/m) * (np.dot(y.T, np.log(h)) + np.dot((1-y).T, np.log(1-h)))
 theta = theta - (alpha/m) * gradient
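Putting the corrections together, here is a minimal runnable sketch of the fixed function. The `sigmoid` helper and the toy data are my own additions for illustration; note that the gradient must also use `h - y` (the prediction error) rather than `z - y`:

```python
import numpy as np

def sigmoid(z):
    # logistic function, maps any real value into (0, 1)
    return 1 / (1 + np.exp(-z))

def gradientDescent(x, y, theta, alpha, num_iters):
    m = len(x)
    for i in range(num_iters):
        z = np.dot(x, theta)          # (m,1) linear scores
        h = sigmoid(z)                # (m,1) predicted probabilities
        # corrected cross-entropy cost: J = -1/m * (y'·log(h) + (1-y)'·log(1-h))
        J = (-1/m) * (np.dot(y.T, np.log(h)) + np.dot((1-y).T, np.log(1-h)))
        # gradient uses h - y, not z - y
        gradient = np.dot(x.T, h - y)
        # corrected update: scale the gradient by alpha/m
        theta = theta - (alpha/m) * gradient
    return J.item(), theta            # .item() extracts the scalar cost

# toy example: bias column plus two features; labels equal the second feature
x = np.array([[1., 0., 0.], [1., 1., 0.], [1., 0., 1.], [1., 1., 1.]])
y = np.array([[0.], [0.], [1.], [1.]])
theta = np.zeros((3, 1))
J, theta = gradientDescent(x, y, theta, alpha=0.5, num_iters=1000)
print(J)  # cost starts at log(2) ≈ 0.693 and should decrease steadily
```

Because this toy data is linearly separable, the cost keeps shrinking toward zero and the fitted model classifies all four rows correctly.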
