
Why is theta0 skipped while performing regularization on regression?

I am currently learning ML on Coursera with Andrew Ng's Machine Learning course. I am doing the assignments in Python because I am more used to it than Matlab. I have recently run into a problem with my understanding of regularization. My understanding is that regularization lets us keep less important features that still contribute to the prediction, by penalizing their parameters so they don't dominate. But while implementing it, I don't understand why the first element of theta (the parameters), i.e. theta[0], is skipped while calculating the cost. I have referred to other solutions, but they also skip it without explanation.

Here is the code:

```python
import numpy as np

# h(theta, X) is the sigmoid hypothesis, m is the number of training
# examples, and lambda_ is the regularization strength (defined elsewhere)
term1 = np.dot(-np.array(y).T, np.log(h(theta, X)))
term2 = np.dot((1 - np.array(y)).T, np.log(1 - h(theta, X)))
regterm = (lambda_ / 2) * np.sum(np.dot(theta[1:].T, theta[1:]))  # Skip theta0. Explain this line
J = float((1 / m) * (np.sum(term1 - term2) + regterm))
grad = np.dot((sigmoid(np.dot(X, theta)) - y), X) / m
grad_reg = grad + ((lambda_ / m) * theta)
grad_reg[0] = grad[0]
```

And here is the formula:

J(theta) = (1/m) * sum_{i=1}^{m} [ -y_i * log(h(x_i)) - (1 - y_i) * log(1 - h(x_i)) ] + (lambda / (2m)) * sum_{j=1}^{n} theta_j^2

(Regularized cost function; note the regularization sum starts at j = 1, so theta_0 is excluded.)

Here J(theta) is the cost function, h(x) is the sigmoid function (the hypothesis), and lambda is the regularization parameter.
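As a self-contained sketch of that formula (using a tiny hypothetical dataset, with names like `cost` chosen for illustration), the regularization term only ever touches `theta[1:]`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, lambda_):
    m = len(y)
    h = sigmoid(X @ theta)
    # Unregularized logistic-regression cost
    unreg = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    # Regularization term: sum over theta[1:], skipping theta[0] (the bias)
    reg = (lambda_ / (2 * m)) * np.sum(theta[1:] ** 2)
    return unreg + reg

# Tiny synthetic example (hypothetical data); first column of X is the bias feature
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.array([0.1, 0.2])
print(cost(theta, X, y, lambda_=1.0))
```

Because `theta[0]` never enters `reg`, changing lambda changes the cost only through the non-bias parameters.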

Theta0 refers to the bias. The bias comes into the picture when we want our decision boundaries to be separated properly. Consider the example of

Y1 = w1 * X and Y2 = w2 * X

When the values of X come close to zero, it can become very hard to separate the two lines; this is where the bias comes into play:

Y1 = w1 * X + b1 and Y2 = w2 * X + b2

Now, through learning, the decision boundaries stay clearly separated all the time.
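A quick numeric check of the point above (with arbitrary illustrative values for w1, w2, b1, b2): without a bias both lines pass through the origin, so near X = 0 they are essentially indistinguishable, while the biases keep them apart.

```python
w1, w2 = 1.0, 2.0
x = 0.001  # a point close to zero

# Without bias, both lines pass through the origin: the gap vanishes near X = 0.
gap_no_bias = abs(w1 * x - w2 * x)

# With biases, the gap stays close to |b1 - b2| even at X = 0.
b1, b2 = 0.5, -0.5
gap_with_bias = abs((w1 * x + b1) - (w2 * x + b2))

print(gap_no_bias, gap_with_bias)
```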

Now let's consider why we use regularization: so that we don't over-fit, by smoothing the curve. As you can see from the equations, it is the slopes w1 and w2 that need smoothing; the biases are just the intercepts of the separating lines. So there is no point in regularizing them.

Although we can regularize the bias (and in the case of neural networks it usually makes little difference), we risk shrinking the bias value so much that the boundary no longer separates the data points well. Thus, it's better not to include the bias in regularization.
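This is exactly what the `grad_reg[0] = grad[0]` line in the question does. A minimal sketch of the same idea (function name and test values are illustrative, not from the course):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reg_gradient(theta, X, y, lambda_):
    m = len(y)
    grad = X.T @ (sigmoid(X @ theta) - y) / m
    # Penalty applies to every parameter ...
    penalty = (lambda_ / m) * theta
    # ... except the bias, whose penalty is zeroed out
    penalty[0] = 0.0
    return grad + penalty
```

With this, the gradient for `theta[0]` is identical whether lambda is zero or not, while all other components pick up the extra `(lambda_/m) * theta_j` term.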

Hope it answers your question. Originally published: https://medium.com/@shrutijadon10104776/why-we-dont-use-bias-in-regularization-5a86905dfcd6
