
Gradient Descent for multivariate Regression value not converging

I have tried this piece of code for multivariable regression to find the coefficients, but I can't tell where I am making a mistake or whether I am on the right path. The problem is that the MSE value is not converging.

Here x1, x2, x3 are the three feature variables I have (I have sliced each feature column into these x1, x2, x3 variables).
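For context, a minimal sketch of what that slicing might look like, assuming x holds the three feature columns side by side (the question does not show how x and y are actually built, so the shapes here are an assumption):

import numpy as np

x = np.random.rand(100, 3)                 # stand-in for the real feature matrix
y = np.random.rand(100)                    # stand-in for the real target vector

x1, x2, x3 = x[:, 0], x[:, 1], x[:, 2]     # one vector per feature column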

import numpy as np

def gradientDescent(x, y):
    # x1, x2, x3 are the global feature vectors described above; x is only used for its length
    mCurrent1 = mCurrent2 = mCurrent3 = bCurrent = 0
    iteration = 1000
    learningRate = 0.0000001
    n = len(x)

    for i in range(iteration):
        y_predict = mCurrent1*x1 + mCurrent2*x2 + mCurrent3*x3 + bCurrent
        mse = (1/n)*np.sum((y - y_predict)**2)

        # partial derivatives of the MSE with respect to each coefficient
        mPartDerivative1 = -(2/n)*np.sum(x1*(y - y_predict))
        mPartDerivative2 = -(2/n)*np.sum(x2*(y - y_predict))
        mPartDerivative3 = -(2/n)*np.sum(x3*(y - y_predict))
        bPartDerivative = -(2/n)*np.sum(y - y_predict)

        # gradient descent update for each coefficient
        mCurrent1 = mCurrent1 - (learningRate*mPartDerivative1)
        mCurrent2 = mCurrent2 - (learningRate*mPartDerivative2)
        mCurrent3 = mCurrent3 - (learningRate*mPartDerivative3)
        bCurrent = bCurrent - (learningRate*bPartDerivative)

        print('m1:{} m2:{} m3:{} b:{} iter:{} mse:{}'.format(mCurrent1, mCurrent2, mCurrent3, bCurrent, i, mse))

    return (round(mCurrent1, 3), round(mCurrent2, 3), round(mCurrent3, 3), round(bCurrent, 3))

It looks like your program should work. However, your learning rate is probably too small. Remember that the learning rate is the size of the step you take down the cost function. If the learning rate is too small, you move down the cost curve too slowly and it takes a long time to reach convergence (requiring a very large number of iterations). If the learning rate is too large, you get divergence instead. Picking the right learning rate and number of iterations (in other words, tuning your hyperparameters) is more of an art than a science. You should play around with different learning rates.
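For reference, the update your code applies to each coefficient is

$$m_j \leftarrow m_j - \alpha\,\frac{\partial \mathrm{MSE}}{\partial m_j}, \qquad \frac{\partial \mathrm{MSE}}{\partial m_j} = -\frac{2}{n}\sum_{i=1}^{n} x_{j,i}\,(y_i - \hat{y}_i),$$

so the distance moved per step is directly proportional to the learning rate α: at α = 0.0000001 each step is tiny, while at a too-large α the steps overshoot the minimum and diverge.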

I created my own dataset with randomly generated data (where (m1, m2, m3, b) = (10, 5, 4, 2)) and ran your code:

import pandas as pd
import numpy as np

x1 = np.random.rand(100,1)
x2 = np.random.rand(100,1)
x3 = np.random.rand(100,1)
y = 2 + 10 * x1 + 5 * x2 + 4 * x3 + 2 * np.random.randn(100,1)
df = pd.DataFrame(np.c_[y,x1,x2,x3],columns=['y','x1','x2','x3'])

#df.head()
#            y        x1        x2        x3
# 0  11.970573  0.785165  0.012989  0.634274
# 1  19.980349  0.919672  0.971063  0.752341
# 2   2.884538  0.170164  0.991058  0.003270
# 3   8.437686  0.474261  0.326746  0.653011
# 4  14.026173  0.509091  0.921010  0.375524
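As a quick sanity check on this synthetic data (my addition, not part of the original answer), the ordinary least-squares solution can be computed directly and used as the target the gradient descent should approach:

# design matrix with an intercept column, so coef comes back as [b, m1, m2, m3]
A = np.c_[np.ones((100, 1)), x1, x2, x3]
coef, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(coef.ravel())   # roughly [2, 10, 5, 4], up to the added noise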

Running your algorithm with a learning rate of 0.0000001 yields the following results:

(m1, m2, m3, b) = (0.001, 0.001, 0.001, 0.002)

Running your algorithm with a learning rate of 0.1 yields the following results:

(m1, m2, m3, b) = (9.382, 4.841, 4.117, 2.485)
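If you want to compare more rates side by side, here is a small sweep (a sketch of my own; it re-implements the same update loop so the learning rate can be passed in as an argument):

def run_gd(x1, x2, x3, y, learning_rate, iterations=1000):
    # same update rule as gradientDescent above, with the learning rate as a parameter
    m1 = m2 = m3 = b = 0.0
    n = len(y)
    for _ in range(iterations):
        y_pred = m1*x1 + m2*x2 + m3*x3 + b
        err = y - y_pred
        m1 -= learning_rate * (-(2/n)*np.sum(x1*err))
        m2 -= learning_rate * (-(2/n)*np.sum(x2*err))
        m3 -= learning_rate * (-(2/n)*np.sum(x3*err))
        b  -= learning_rate * (-(2/n)*np.sum(err))
    mse = (1/n)*np.sum((y - (m1*x1 + m2*x2 + m3*x3 + b))**2)
    return m1, m2, m3, b, mse

for lr in [1e-7, 1e-3, 1e-1]:
    m1, m2, m3, b, mse = run_gd(x1, x2, x3, y, lr)
    print('lr={}: m1={:.3f} m2={:.3f} m3={:.3f} b={:.3f} mse={:.3f}'.format(lr, m1, m2, m3, b, mse))

You should see the same pattern as above: 1e-7 barely moves the coefficients, while 1e-1 ends up close to (10, 5, 4, 2).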

Notice that when the learning rate is 0.0000001, your coefficients end up barely different from where they started (0). As I said earlier, the small learning rate means the coefficients change at far too small a rate, because we are moving down the cost function in extremely small steps.
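A rough back-of-the-envelope calculation (my own illustration, using approximate magnitudes) shows why the coefficients stall around 0.001:

learning_rate = 0.0000001
approx_gradient = 10     # rough early magnitude of (2/n)*sum(x*(y - y_predict)) with y ~ 10 and x in [0, 1]
step = learning_rate * approx_gradient
print(step)              # ~1e-6 moved per iteration
print(step * 1000)       # ~0.001 after 1000 iterations, about where the coefficients above ended up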

I have added a picture to help visualize picking a step size. Notice that the first picture uses a small learning rate, and the second uses a larger learning rate.

Small learning rate: (image not included here)

Large learning rate: (image not included here)
