[英]Regularized logistic regression code in matlab
我正在嘗試使用正規化的LR,在matlab中使用這個公式很簡單:
成本函數:
J(theta) = 1/m*sum((-y_i)*log(h(x_i)-(1-y_i)*log(1-h(x_i))))+(lambda/2*m)*sum(theta_j)
漸變:
∂J(theta)/∂theta_0 = [(1/m)*(sum((h(x_i)-y_i)*x_j)] if j=0
∂j(theta)/∂theta_n = [(1/m)*(sum((h(x_i)-y_i)*x_j)]+(lambda/m)*(theta_j) if j>1
這不是matlab代碼只是公式。
到目前為止我已經這樣做了:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
J = 0;
grad = zeros(size(theta));
temp_theta = [];
%cost function
%get the regularization term
for jj = 2:length(theta)
temp_theta(jj) = theta(jj)^2;
end
theta_reg = lambda/(2*m)*sum(temp_theta);
temp_sum =[];
%for the sum in the cost function
for ii =1:m
temp_sum(ii) = -y(ii)*log(sigmoid(theta'*X(ii,:)'))-(1-y(ii))*log(1-sigmoid(theta'*X(ii,:)'));
end
tempo = sum(temp_sum);
J = (1/m)*tempo+theta_reg;
%regulatization
%theta 0
reg_theta0 = 0;
for jj=1:m
reg_theta0(jj) = (sigmoid(theta'*X(m,:)') -y(jj))*X(jj,1)
end
reg_theta0 = (1/m)*sum(reg_theta0)
grad_temp(1) = reg_theta0
%for the rest of thetas
reg_theta = [];
thetas_sum = 0;
for ii=2:size(theta)
for kk =1:m
reg_theta(kk) = (sigmoid(theta'*X(m,:)') - y(kk))*X(kk,ii)
end
thetas_sum(ii) = (1/m)*sum(reg_theta)+(lambda/m)*theta(ii)
reg_theta = []
end
for i=1:size(theta)
if i == 1
grad(i) = grad_temp(i)
else
grad(i) = thetas_sum(i)
end
end
end
成本函數給出正確的結果,但我不知道為什么梯度(一步)不是,成本給出J = 0.6931這是正確的,梯度grad = 0.3603 -0.1476 0.0320,這不是,成本開始從2開始,因為參數theta(1)不必正則化,有什么幫助嗎? 我猜這個代碼有問題,但4天后我看不到它。謝謝
矢量:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
hx = sigmoid(X * theta);
m = length(X);
J = (sum(-y' * log(hx) - (1 - y')*log(1 - hx)) / m) + lambda * sum(theta(2:end).^2) / (2*m);
grad =((hx - y)' * X / m)' + lambda .* theta .* [0; ones(length(theta)-1, 1)] ./ m ;
end
我使用了更多變量,因此您可以清楚地看到常規公式的內容,以及“增加的正則化成本”。 另外,在Matlab / Octave中使用“矢量化”代替循環是一種很好的做法。 通過這樣做,您可以保證更優化的解決方案。
function [J, grad] = costFunctionReg(theta, X, y, lambda)
%Hypotheses
hx = sigmoid(X * theta);
%%The cost without regularization
J_partial = (-y' * log(hx) - (1 - y)' * log(1 - hx)) ./ m;
%%Regularization Cost Added
J_regularization = (lambda/(2*m)) * sum(theta(2:end).^2);
%%Cost when we add regularization
J = J_partial + J_regularization;
%Grad without regularization
grad_partial = (1/m) * (X' * (hx -y));
%%Grad Cost Added
grad_regularization = (lambda/m) .* theta(2:end);
grad_regularization = [0; grad_regularization];
grad = grad_partial + grad_regularization;
最后得到它,在第四次重寫之后,這是正確的代碼:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
J = 0;
grad = zeros(size(theta));
temp_theta = [];
for jj = 2:length(theta)
temp_theta(jj) = theta(jj)^2;
end
theta_reg = lambda/(2*m)*sum(temp_theta);
temp_sum =[];
for ii =1:m
temp_sum(ii) = -y(ii)*log(sigmoid(theta'*X(ii,:)'))-(1-y(ii))*log(1-sigmoid(theta'*X(ii,:)'));
end
tempo = sum(temp_sum);
J = (1/m)*tempo+theta_reg;
%regulatization
%theta 0
reg_theta0 = 0;
for i=1:m
reg_theta0(i) = ((sigmoid(theta'*X(i,:)'))-y(i))*X(i,1)
end
theta_temp(1) = (1/m)*sum(reg_theta0)
grad(1) = theta_temp
sum_thetas = []
thetas_sum = []
for j = 2:size(theta)
for i = 1:m
sum_thetas(i) = ((sigmoid(theta'*X(i,:)'))-y(i))*X(i,j)
end
thetas_sum(j) = (1/m)*sum(sum_thetas)+((lambda/m)*theta(j))
sum_thetas = []
end
for z=2:size(theta)
grad(z) = thetas_sum(z)
end
% =============================================================
end
如果它可以幫助任何人,或者任何人對我如何做得更好有任何意見。 :)
這是一個消除循環的答案
m = length(y); % number of training examples
predictions = sigmoid(X*theta);
reg_term = (lambda/(2*m)) * sum(theta(2:end).^2);
calcErrors = -y.*log(predictions) - (1 -y).*log(1-predictions);
J = (1/m)*sum(calcErrors)+reg_term;
% prepend a 0 column to our reg_term matrix so we can use simple matrix addition
reg_term = [0 (lambda*theta(2:end)/m)'];
grad = sum(X.*(predictions - y)) / m + reg_term;
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.