TypeError and ValueError in Newton's Method applied to gradient descent with backtracking
I am trying to apply Newton's Method to a gradient descent algorithm with backtracking.
Gradient Descent algorithm:
Gradient Descent with Backtracking algorithm:
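Sketch of the update rules as implemented in the code below (reconstructed from the code, with $\eta$ the learning rate and $f$ the Rosenbrock function):

\[
\begin{aligned}
&\text{gradient descent:} && w \leftarrow w - \eta\,\nabla f(w),\\
&\text{backtracking:} && \text{halve } \eta \text{ until } f\bigl(w - \eta\,\nabla f(w)\bigr) \le f(w) - \tfrac{\eta}{2}\,\lVert \nabla f(w) \rVert^{2},\\
&\text{Newton step:} && w \leftarrow w - \eta\,H(w)^{-1}\,\nabla f(w).
\end{aligned}
\]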
import numpy as np
from scipy import optimize as opt

def newton_gd_backtracking(w, itmax, tol):
    # You may set bounds of "learnrate"
    max_learnrate = 0.1
    min_learnrate = 0.001
    for i in range(itmax):
        grad = opt.rosen_der(w)
        grad2 = (np.linalg.norm(grad))**2
        hess = opt.rosen_hess(w)
        # you have to decide "learnrate"
        learnrate = max_learnrate
        while True:
            f0 = opt.rosen(w)
            f1 = opt.rosen(w - learnrate * grad)
            if f1 <= (f0 - (learnrate/2)*grad2):
                break
            else:
                learnrate /= 2
                if learnrate < min_learnrate:
                    learnrate = min_learnrate; break
        # now, Newton's method
        deltaw = - learnrate * np.linalg.inv(hess) * grad
        w = w + deltaw
        if np.linalg.norm(deltaw) < tol:
            break
    return w, i, learnrate

# You can call the above function, by adding main
if __name__ == "__main__":
    w0 = np.array([0, 0])
    itmax = 10000; tol = 1.e-5
    w, i, learnrate = newton_gd_backtracking(w0, itmax, tol)
    print('Weight: ', w)
    print('Iterations: ', i)
    print('Learning Rate: ', learnrate)
After running the program, I get these error messages:
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:/Users/Desfios 5/Desktop/Python/homework submission/Deyeon/GD_BT_Newton/Main_Newton_GD_Backtracking.py", line 43, in <module>
    w, i, learnrate = newton_gd_backtracking(w0,itmax,tol)
  File "c:/Users/Desfios 5/Desktop/Python/homework submission/Deyeon/GD_BT_Newton/Main_Newton_GD_Backtracking.py", line 12, in newton_gd_backtracking
    hess = opt.rosen_hess(w)
  File "C:\Users\Desfios 5\AppData\Roaming\Python\Python38\site-packages\scipy\optimize\optimize.py", line 373, in rosen_hess
    diagonal[0] = 1200 * x[0]**2 - 400 * x[1] + 2
ValueError: setting an array element with a sequence.
When I run this without the Hessian, as a normal gradient descent with backtracking, the code works fine. Here is the code that I used for normal gradient descent with backtracking:
import numpy as np
from scipy import optimize as opt

def gd_backtracking(w, itmax, tol):
    # You may set bounds of "learnrate"
    max_learnrate = 0.1
    min_learnrate = 0.001
    for i in range(itmax):
        grad = opt.rosen_der(w)
        grad2 = (np.linalg.norm(grad))**2
        # you have to decide "learnrate"
        learnrate = max_learnrate
        while True:
            f0 = opt.rosen(w)
            f1 = opt.rosen(w - learnrate * grad)
            if f1 <= (f0 - (learnrate/2)*grad2):
                break
            else:
                learnrate /= 2
                if learnrate < min_learnrate:
                    learnrate = min_learnrate; break
        # now, march
        deltaw = - learnrate * grad
        w = w + deltaw
        if np.linalg.norm(deltaw) < tol:
            break
    return w, i, learnrate

# You can call the above function, by adding main
if __name__ == "__main__":
    w0 = np.array([0, 0])
    itmax = 10000; tol = 1.e-5
    w, i, learnrate = gd_backtracking(w0, itmax, tol)
    print('Weight: ', w)
    print('Iterations: ', i)
    print('Learning Rate: ', learnrate)
Is there something about the Hessian matrix that I don't know about? To my understanding, opt.rosen_hess should yield a 1D array just like opt.rosen_der. Perhaps I'm using opt.rosen_hess in the wrong way. What am I missing here?
After going through my work over and over, I realized my error. In newton_gd_backtracking, I was multiplying the inverse Hessian and the gradient element-wise. These aren't scalars, so I should have used a dot product (a matrix-vector product). Once I used the dot product, I got the desired results.
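For reference, a minimal sketch of the corrected version (not necessarily the poster's exact final code): the only substantive change is the Newton step, written here with np.linalg.solve(hess, grad), which gives the same result as np.dot(np.linalg.inv(hess), grad) without forming the inverse explicitly.

import numpy as np
from scipy import optimize as opt

def newton_gd_backtracking(w, itmax, tol):
    max_learnrate = 0.1
    min_learnrate = 0.001
    for i in range(itmax):
        grad = opt.rosen_der(w)      # gradient, 1D array of shape (n,)
        grad2 = np.linalg.norm(grad)**2
        hess = opt.rosen_hess(w)     # Hessian, 2D array of shape (n, n)
        learnrate = max_learnrate
        while True:
            f0 = opt.rosen(w)
            f1 = opt.rosen(w - learnrate * grad)
            if f1 <= f0 - (learnrate / 2) * grad2:
                break
            learnrate /= 2
            if learnrate < min_learnrate:
                learnrate = min_learnrate
                break
        # Newton step as a matrix-vector product: deltaw stays a 1D vector,
        # so w keeps shape (n,) and rosen_hess(w) remains valid next iteration
        deltaw = -learnrate * np.linalg.solve(hess, grad)
        w = w + deltaw
        if np.linalg.norm(deltaw) < tol:
            break
    return w, i, learnrate

if __name__ == "__main__":
    w, i, learnrate = newton_gd_backtracking(np.array([0.0, 0.0]), 10000, 1.e-5)
    print('Weight: ', w)
    print('Iterations: ', i)
    print('Learning Rate: ', learnrate)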