Simple Linear Regression not converging
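While trying to dig deeper into the math behind machine learning models, I am implementing the Ordinary Least Squares algorithm in Python using vectorization. My references are:
This is what I have so far: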
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

%matplotlib inline

X, y = datasets.load_diabetes(return_X_y=True)

# We only take the first feature (for visualization purposes).
X = X[:, np.newaxis, 2]

# Split the data into training/testing sets
X_train = X[:-20]
X_test = X[-20:]
y_train = y[:-20]
y_test = y[-20:]

# Input data
sns.scatterplot(
    x=X_train[:, 0],
    y=y_train,
    label="train",
    edgecolor=None,
    color="blue"
)

# To predict
sns.scatterplot(
    x=X_test[:, 0],
    y=y_test,
    label="test",
    edgecolor=None,
    marker="*",
    color="red",
);

class LinearRegression:
    """
    Ordinary least squares Linear Regression.

    Args:
    """

    def __init__(self, learning_rate: float = 0.01, tolerance: float = 1e4, standardize: bool = True):
        # TODO: standardize if required
        self._learning_rate: float = learning_rate
        self._tolerance: float = tolerance
        self._standardize: bool = standardize
        self._fitted: bool = False

    def fit(self, X: np.ndarray, y: np.ndarray) -> None:
        """Fit linear model."""
        self._X: np.ndarray = X
        self._y: np.ndarray = y[:, np.newaxis]
        self._m, self._n = self._X.shape  # rows, features
        self._weights: np.ndarray = np.zeros((self._n, 1))
        self._train()

    def predict(self, X: np.ndarray, add_bias: bool = True) -> np.ndarray:
        """Predict using the linear model."""
        assert self._fitted, "Model not fitted."
        if add_bias:
            X = np.c_[np.ones((X.shape[0], 1)), X]
        predictions = np.dot(X, self._weights)
        return predictions

    def _train(self) -> None:
        """
        Fit the weights to the training data with gradient descent.

        Algorithm:
            1. Initialize weights.
            2. Compute the cost.
            3. Calculate the gradient.
            4. Update weights.
            5. Repeat from 2 until convergence.
        """
        # Add bias term
        self._X = np.c_[np.ones((self._m, 1)), self._X]
        self._weights = np.r_[np.ones((1, 1)), self._weights]
        self._fitted = True

        converged = False
        iterations = 0
        while not converged:
            iterations += 1
            y_hat = self.predict(self._X, add_bias=False)
            residuals = self._residuals(self._y, y_hat)
            gradients = self._gradients(self._X, residuals)
            self._weights -= self._learning_rate * gradients
            gradient_magnitude = np.linalg.norm(gradients)
            print(gradient_magnitude)
            if gradient_magnitude < self._tolerance:
                converged = True
        print(self._weights)
        print(iterations)

    def _residuals(self, y: np.ndarray, y_hat: np.ndarray) -> np.ndarray:
        residuals = y - y_hat
        return residuals

    def _gradients(self, X: np.ndarray, residuals: np.ndarray) -> np.ndarray:
        gradients = -2 * np.dot(X.T, residuals)
        return gradients

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
clf = LinearRegression()
clf.fit(X_train, y_train)
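The problem I am facing is that my weights keep growing until I end up with a bunch of NaNs. I have been trying to figure out what I am missing, but no luck so far. I also tried tweaking the tolerance threshold, but I don't think that is the issue; I suspect something is wrong with my math.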
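Your code actually seems to work fine, except for the learning rate, really! Just reduce it from 0.01 to, e.g., 0.0001 and everything works (well, I would also reduce the tolerance to something much smaller, e.g. 1e-5, to make sure it actually converges to the right solution).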
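To see why 0.01 is too large here, it helps to look at the curvature of the cost. The gradient in the question is the un-averaged sum `-2 * X.T @ residuals`, so the Hessian of the cost is `2 * X.T @ X`, and plain gradient descent on a quadratic converges only when the learning rate stays below 2 divided by the largest eigenvalue of that Hessian. The sketch below uses synthetic data as a stand-in for the standardized diabetes feature (422 training rows, one feature; an assumption made so the snippet runs without scikit-learn), and the helper name `max_stable_learning_rate` is hypothetical.

```python
import numpy as np


def max_stable_learning_rate(X: np.ndarray) -> float:
    """Upper bound on the step size for the summed squared-error cost.

    X must already contain the bias column. The cost is sum((y - Xw)**2),
    whose Hessian is 2 * X^T X; gradient descent converges only if
    learning_rate < 2 / lambda_max(Hessian).
    """
    hessian = 2.0 * X.T @ X
    lambda_max = np.linalg.eigvalsh(hessian).max()
    return 2.0 / lambda_max


# Synthetic stand-in for the question's data (assumption): 422 rows, one
# feature, standardized to zero mean and unit variance like StandardScaler.
rng = np.random.default_rng(0)
feature = rng.normal(size=(422, 1))
feature = (feature - feature.mean(axis=0)) / feature.std(axis=0)
X = np.c_[np.ones((422, 1)), feature]

limit = max_stable_learning_rate(X)
print(f"largest stable learning rate: {limit:.5f}")
print("0.01 stable?  ", 0.01 < limit)
print("0.0001 stable?", 0.0001 < limit)
```

For a standardized feature with m rows, `X.T @ X` is m times the identity, so the limit works out to 1 / m, roughly 0.0024 for m = 422. That puts 0.01 above the limit and 0.0001 well below it. Averaging the gradient over the m rows instead of summing it would raise the limit to 1 for this standardized one-feature setup, which is another way to make the default learning rate usable.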
A small picture to show that it works:
clf = LinearRegression(learning_rate=0.0001)
clf.fit(X_train, y_train)
b, m = clf._weights[:, 0]
plt.scatter(X_train[:, 0], y_train)
plt.plot([-2, 4], [-2 * m + b, 4 * m + b])
gives: [plot: scatter of the training data with the fitted regression line]
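Besides eyeballing a plot, one way to check that the descent reached the right solution is to compare it with a closed-form least-squares fit. The following is a minimal sketch on synthetic data; the data, the `gradient_descent` helper and its defaults are assumptions for illustration and not the asker's class. It uses the same summed gradient as the question and `np.linalg.lstsq` as the reference.

```python
import numpy as np


def gradient_descent(X, y, learning_rate=1e-4, tolerance=1e-5, max_iter=100_000):
    """Batch gradient descent on sum((y - Xw)**2); X includes the bias column."""
    weights = np.zeros((X.shape[1], 1))
    for _ in range(max_iter):
        residuals = y - X @ weights
        gradients = -2.0 * X.T @ residuals  # same un-averaged gradient as the question
        weights -= learning_rate * gradients
        if np.linalg.norm(gradients) < tolerance:
            break
    return weights


# Synthetic data (assumption): y = 150 + 45 * x + noise, x already standardized.
rng = np.random.default_rng(0)
x = rng.normal(size=(422, 1))
x = (x - x.mean(axis=0)) / x.std(axis=0)
y = 150.0 + 45.0 * x + rng.normal(scale=10.0, size=(422, 1))
X = np.c_[np.ones((422, 1)), x]

w_gd = gradient_descent(X, y)
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

print("gradient descent:", w_gd.ravel())
print("closed form:     ", w_exact.ravel())
print("match:", np.allclose(w_gd, w_exact, atol=1e-6))
```

The `max_iter` cap is an addition that keeps the loop from running forever if the learning rate is chosen too large; the question's `while not converged` loop has no such guard.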
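Linear regression is a convex optimization problem, so you can picture it as putting a ball on a parabola and then moving it towards the bottom by a fixed amount multiplied by the slope at its current position. If that "fixed amount" is small enough, you get closer and closer to the bottom until you find the optimal position. But if the value is too large, you jump from one side of the parabola to the other, and if it is large enough you land in a place that is actually higher than where you started from. Repeat that a few times and you end up in exactly the situation you ran into...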
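The ball-on-a-parabola picture can be tried directly on the one-dimensional cost f(w) = w**2, a toy stand-in chosen for illustration rather than the regression cost itself. The function name `descend` and the two learning rates are arbitrary choices.

```python
def descend(learning_rate: float, start: float = 1.0, steps: int = 5) -> list:
    """Gradient descent on f(w) = w**2, whose slope at w is 2 * w."""
    w = start
    path = [w]
    for _ in range(steps):
        w -= learning_rate * 2 * w  # move against the slope
        path.append(w)
    return path


small = descend(learning_rate=0.1)  # each step multiplies w by 0.8
large = descend(learning_rate=1.1)  # each step multiplies w by -1.2

print("small step:", [round(w, 4) for w in small])
print("large step:", [round(w, 4) for w in large])
```

With the small step the position shrinks towards 0 on every iteration. With the large step it flips sign each time and moves further from the bottom, which is the same runaway growth that eventually produces NaNs in the question.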