Logistic regression from scratch
I am implementing multinomial logistic regression using gradient descent with L2 regularization on the MNIST dataset. My training data is a dataframe with shape (n_samples=1198, features=65). On each iteration of gradient descent, I take a linear combination of the weights and inputs to obtain 1198 activations (beta^T * X). I then pass these activations through a softmax function. However, I am confused about how I would obtain a probability distribution over the 10 output classes for each activation.
My weights are initialized as follows:
import numpy as np
import pandas as pd

n_features = 65
# init random weights
beta = np.random.uniform(0, 1, n_features).reshape(1, -1)
This is my current implementation:
def softmax(x: np.ndarray):
    exps = np.exp(x)
    return exps / np.sum(exps, axis=0)

def cross_entropy(y_hat: np.ndarray, y: np.ndarray, beta: np.ndarray) -> float:
    """
    Computes cross entropy for multiclass classification
    y_hat: predicted classes, n_samples x n_feats
    y: ground truth classes, n_samples x 1
    """
    n = len(y)
    return - np.sum(y * np.log(y_hat) + beta**2 / n)

def gd(X: pd.DataFrame, y: pd.Series, beta: np.ndarray,
       lr: float, N: int, iterations: int) -> (np.ndarray, np.ndarray):
    """
    Gradient descent
    """
    n = len(y)
    cost_history = np.zeros(iterations)
    for it in range(iterations):
        activations = X.dot(beta.T).values
        y_hat = softmax(activations)
        cost_history[it] = cross_entropy(y_hat, y, beta)
        # gradient of weights
        grads = np.sum((y_hat - y) * X).values
        # update weights
        beta = beta - lr * (grads + 2/n * beta)
    return beta, cost_history
In multinomial logistic regression, you need a separate set of parameters (the pixel weights in your case) for every class. In your case that means beta should have shape (10, 65) rather than (1, 65), so that multiplying X by beta^T gives a 1198 x 10 matrix of scores, one score per class for each instance. The probability of an instance belonging to a certain class is then estimated as the softmax function of the instance's score for that class. The softmax function makes sure that the estimated probabilities sum to 1 over all classes, so it has to be applied across the 10 class scores of each instance, not across instances.
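To make the shapes concrete, here is a minimal NumPy sketch of the idea, not a definitive implementation. It assumes the labels are integers 0 to 9 and that X has already been converted to a plain array (for a dataframe, something like X.to_numpy()). The helper one_hot, the regularization strength lam, the seed argument, and the random demo data are all illustrative names and values that do not come from the question.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    """Row-wise softmax: each row (one sample) sums to 1 across classes."""
    # Subtracting the row maximum avoids overflow and does not change the result.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)


def one_hot(y, n_classes: int) -> np.ndarray:
    """Turn integer labels of shape (n_samples,) into (n_samples, n_classes)."""
    return np.eye(n_classes)[np.asarray(y)]


def cross_entropy(y_hat, y_onehot, beta, lam):
    """Mean cross-entropy plus an L2 penalty lam * sum(beta**2)."""
    n = y_onehot.shape[0]
    eps = 1e-12  # guards against log(0)
    data_term = -np.sum(y_onehot * np.log(y_hat + eps)) / n
    return data_term + lam * np.sum(beta ** 2)


def gd(X, y, n_classes, lr=0.1, lam=1e-3, iterations=200, seed=0):
    """Batch gradient descent with one weight row per class."""
    n, n_features = X.shape
    rng = np.random.default_rng(seed)
    # One row of weights per class: shape (n_classes, n_features).
    beta = rng.uniform(0, 1, size=(n_classes, n_features)) * 0.01
    Y = one_hot(y, n_classes)
    cost_history = np.zeros(iterations)
    for it in range(iterations):
        scores = X @ beta.T                  # (n_samples, n_classes)
        y_hat = softmax(scores)              # each row is a distribution
        cost_history[it] = cross_entropy(y_hat, Y, beta, lam)
        # Gradient of the mean cross-entropy plus gradient of the L2 penalty.
        grads = (y_hat - Y).T @ X / n + 2 * lam * beta
        beta = beta - lr * grads
    return beta, cost_history


# Demo on random data with the shapes from the question (not real digits).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(1198, 65))
y_demo = rng.integers(0, 10, size=1198)
beta, costs = gd(X_demo, y_demo, n_classes=10)
probs = softmax(X_demo @ beta.T)
print(beta.shape)   # (10, 65)
print(probs.shape)  # (1198, 10)
```

Each row of probs is the probability distribution over the 10 classes for one sample, and probs.argmax(axis=1) gives the predicted class. For simplicity this sketch penalizes every column of beta, including any bias column, which you may want to exclude.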