
Vectorizing a gradient descent algorithm

I am coding gradient descent in MATLAB. For two features, the update step I get is:

temp0 = theta(1,1) - (alpha/m)*sum((X*theta-y).*X(:,1));
temp1 = theta(2,1) - (alpha/m)*sum((X*theta-y).*X(:,2));
theta(1,1) = temp0;
theta(2,1) = temp1;

However, I want to vectorize this code so that it can be applied to any number of features. For the vectorization part, it appears that what I am trying to do is a matrix multiplication:

theta = theta - (alpha/m) * (X' * (X*theta-y));

This looks right, but when I tried it, I realized that it doesn't work for gradient descent because the parameters are not updated simultaneously.

So, how can I vectorize this code and make sure the parameters are updated at the same time?

For the vectorized version, try the following (two steps, to make the simultaneous update explicit):

gradient = (alpha/m) * X' * (X*theta - y);
theta = theta - gradient;

Your vectorization is correct. I also tried both versions of your code, and they gave me the same theta. Just remember not to use the already-updated theta within the same iteration in your second implementation.
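
As a quick check (a minimal sketch with made-up data, added here for illustration), the loop update with temporaries and the one-line vectorized update give the same result, because the whole right-hand side is evaluated with the old theta before the assignment:

% Made-up data, assumed only for this check
X = [1 1; 1 2; 1 3];          % m x 2 design matrix, first column of ones
y = [1; 2; 3];
theta = [0; 0];
alpha = 0.01;
m = length(y);

% Loop version with explicit temporaries
temp0 = theta(1,1) - (alpha/m)*sum((X*theta-y).*X(:,1));
temp1 = theta(2,1) - (alpha/m)*sum((X*theta-y).*X(:,2));
theta_loop = [temp0; temp1];

% Vectorized version: the right-hand side uses the old theta throughout
theta_vec = theta - (alpha/m) * (X' * (X*theta - y));

disp(max(abs(theta_loop - theta_vec)))   % prints 0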

This also works, though it is less compact than your second implementation (a generalization to n features is sketched after the code):

Error = X * theta - y;
for i = 1:2                       % one pass per feature (two features here)
    S(i) = sum(Error .* X(:,i));
end

theta = theta - alpha * (1/m) * S';
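
The same idea extends to any number of features; here is a sketch (the use of size(X,2) and the preallocation of S are additions for illustration):

n = size(X, 2);              % number of columns of X (bias column plus features)
Error = X * theta - y;       % Error is computed once, from the old theta
S = zeros(n, 1);             % preallocate the per-feature gradient sums
for i = 1:n
    S(i) = sum(Error .* X(:,i));
end
theta = theta - (alpha/m) * S;   % simultaneous update for all n parameters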

In order to update them simultaneously, you need to keep the values of theta(1..n) in a temporary vector, and after the operation copy those values back into the original theta vector.

This is the code that I use for this purpose:

Temp update:

tempChange = zeros(length(theta), 1);
tempChange = theta - (alpha/m) * (X' * (X*theta - y));

Actual update:

theta = tempChange;

theta = theta - (alpha/m) * (X') * ((X*theta)-y)

I am very new to this topic, but my opinion is: if you compute X*theta beforehand, then while doing the vectorized operation to adjust theta you do not need a temp. In other words: if you compute X*theta while updating the theta vector, theta(1) updates before theta(2) and hence changes X*theta; but if we compute X*theta as y_pred first and then do the vectorized op on theta, it will be OK.

So my suggestion is (without using a temp variable):

y_pred = X*theta;    % theta is [1;1] and X is an m x 2 matrix
theta = theta - (alpha/m) * (X' * (y_pred - y));

Please correct me if I am wrong.
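
As a quick sanity check (a sketch with made-up values; the names theta_a and theta_b are illustrative), both forms give identical updates, because the whole right-hand side is evaluated before it is assigned back to theta:

% Made-up data, assumed only for this comparison
X = [1 1; 1 2; 1 3];
y = [2; 4; 6];
alpha = 0.1; m = length(y);

theta_a = [1; 1];
y_pred = X * theta_a;                                    % prediction from the old theta
theta_a = theta_a - (alpha/m) * (X' * (y_pred - y));

theta_b = [1; 1];
theta_b = theta_b - (alpha/m) * (X' * (X*theta_b - y));  % one-liner, same result

disp(isequal(theta_a, theta_b))                          % prints 1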

Here is the vectorized form of gradient descent; it works for me in Octave.

Remember that X is a matrix with ones in its first column (since theta_0 * 1 is theta_0). Each column of X corresponds to a feature (n of them), and each row is a training example (m of them), so X is an m x (n+1) matrix. The column vector y could be, for example, the house prices. It is good to have a cost function to check whether you have found a minimum.

Choose a value for alpha, maybe alpha = 0.001, and try changing it each time you run the code. num_iters is the number of times you want it to run.

function theta = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y);  % number of training examples

for iter = 1:num_iters
    % the whole right-hand side is evaluated with the old theta,
    % so all parameters are updated simultaneously
    theta = theta - (alpha/m) * X' * (X*theta - y);
end

end
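
A usage sketch (the data below is made up for illustration; the cost J is the usual squared-error cost for linear regression, added only to show the convergence check mentioned above):

X = [1 1; 1 2; 1 3; 1 4];    % m x (n+1): a column of ones, then one feature
y = [2; 4; 6; 8];
theta = zeros(2, 1);

theta = gradientDescent(X, y, theta, 0.01, 1500);

m = length(y);
J = (1/(2*m)) * sum((X*theta - y).^2);   % cost should be near its minimum
fprintf('theta = [%f; %f], J = %f\n', theta(1), theta(2), J);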

See the full explanation here: https://www.coursera.org/learn/machine-learning/resources/QQx8l
