Gradient Descent implementation in Python?

I have tried to implement gradient descent. It worked correctly when I tested it on a sample dataset, but it is not working properly for the Boston dataset.

Can you verify what's wrong with the code? Why am I not getting a correct theta vector?

import numpy as np
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

X = load_boston().data
y = load_boston().target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train1 = np.c_[np.ones((len(X_train), 1)), X_train]
X_test1 = np.c_[np.ones((len(X_test), 1)), X_test]

eta = 0.0001
n_iterations = 100
m = len(X_train1)
tol = 0.00001

theta = np.random.randn(14, 1)

for i in range(n_iterations):
    gradients = 2/m * X_train1.T.dot(X_train1.dot(theta) - y_train)
    if np.linalg.norm(X_train1) < tol:
        break
    theta = theta - (eta * gradients)

I'm getting my weight vector in the shape of (14, 354). What am I doing wrong here?

Consider this (some statements are unrolled for better visibility):

for i in range(n_iterations):
    y_hat = X_train1.dot(theta)              # predictions, shape (n_samples, 1)
    error = y_hat - y_train[:, None]         # add a dummy axis so both operands are (n_samples, 1)
    gradients = 2/m * X_train1.T.dot(error)  # shape (n_features + 1, 1)

    if np.linalg.norm(gradients) < tol:      # check convergence on the gradient, not on the data
        break
    theta = theta - (eta * gradients)

Since y_hat has shape (n_samples, 1) and y_train has shape (n_samples,) (for your example n_samples is 354), you need to bring y_train to the same shape with the dummy-axis trick y_train[:, None].
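To see the pitfall in isolation, here is a minimal sketch with dummy arrays (a and b are hypothetical stand-ins for X_train1.dot(theta) and y_train; the values are arbitrary):

import numpy as np

a = np.zeros((354, 1))  # 2-D column vector, like X_train1.dot(theta)
b = np.zeros(354)       # 1-D array, like y_train

print((a - b).shape)            # (354, 354): b is broadcast across all 354 columns
print((a - b[:, None]).shape)   # (354, 1): the dummy axis keeps the result a column vector

The (354, 354) error matrix is exactly what turns your gradients, and therefore theta, into a (14, 354) array.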

y_train here is a 1-dimensional NumPy array (ndim=1), whereas X_train1.dot(theta) is a 2-D NumPy array (ndim=2). When you do the subtraction, y_train gets broadcast to match the other operand's shape. To address this you can convert y_train to a 2-D array as well, e.g. with y_train.reshape(-1, 1):

for i in range(n_iterations):
    gradients = 2/m * X_train1.T.dot(X_train1.dot(theta) - y_train.reshape(-1, 1))
    if np.linalg.norm(gradients) < tol:   # convergence check on the gradient, not the data
        break
    theta = theta - (eta * gradients)
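Putting it together, here is a minimal end-to-end sketch with the reshape fix applied. Note two caveats: load_boston was deprecated and then removed in scikit-learn 1.2, so this assumes an older scikit-learn version (or swap in another regression dataset), and with unscaled features a learning rate of this size may still diverge, so standardizing X beforehand is common practice.

import numpy as np
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older version
from sklearn.model_selection import train_test_split

X = load_boston().data
y = load_boston().target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train1 = np.c_[np.ones((len(X_train), 1)), X_train]  # prepend a bias column

eta = 0.0001
n_iterations = 100
m = len(X_train1)
tol = 0.00001

theta = np.random.randn(14, 1)        # 13 features + 1 bias term
y_train2d = y_train.reshape(-1, 1)    # (n_samples, 1), so the subtraction stays a column

for i in range(n_iterations):
    gradients = 2/m * X_train1.T.dot(X_train1.dot(theta) - y_train2d)
    if np.linalg.norm(gradients) < tol:
        break
    theta = theta - (eta * gradients)

print(theta.shape)  # (14, 1) as expected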
