
Why is theta0 skipped while performing regularization on regression?

I am currently learning ML on Coursera with Andrew Ng's Machine Learning course. I am doing the assignments in Python because I am more used to it than Matlab. I have recently run into a problem with my understanding of regularization. My understanding is that regularization lets us keep less important features that still contribute to the prediction, by penalizing their parameters so they don't dominate. But while implementing it, I don't understand why the first element of theta (the parameters), i.e. theta[0], is skipped while calculating the cost. I have referred to other solutions, but they also skip it without explanation.

Here is the code:

```python
import numpy as np

# h(theta, X) is the sigmoid hypothesis, m is the number of training
# examples, and lambda_ is the regularization strength (defined elsewhere)
term1 = np.dot(-np.array(y).T, np.log(h(theta, X)))
term2 = np.dot((1 - np.array(y)).T, np.log(1 - h(theta, X)))
regterm = (lambda_ / 2) * np.sum(np.dot(theta[1:].T, theta[1:]))  # Skip theta0. Explain this line
J = float((1 / m) * (np.sum(term1 - term2) + regterm))
grad = np.dot((sigmoid(np.dot(X, theta)) - y), X) / m
grad_reg = grad + ((lambda_ / m) * theta)
grad_reg[0] = grad[0]
```

And here is the formula:

J(theta) = (1/m) * sum_{i=1}^{m} [ -y_i * log(h(x_i)) - (1 - y_i) * log(1 - h(x_i)) ] + (lambda / (2m)) * sum_{j=1}^{n} theta_j^2

(Regularized cost function; note the regularization sum starts at j = 1, so theta_0 is excluded.)

Here J(theta) is the cost function, h(x) is the sigmoid function (the hypothesis), and lambda is the regularization parameter.
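As a self-contained sketch of that formula (using a tiny hypothetical dataset, with names like `cost` chosen for illustration), the regularization term only ever touches `theta[1:]`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, lambda_):
    m = len(y)
    h = sigmoid(X @ theta)
    # Unregularized logistic-regression cost
    unreg = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    # Regularization term: sum over theta[1:], skipping theta[0] (the bias)
    reg = (lambda_ / (2 * m)) * np.sum(theta[1:] ** 2)
    return unreg + reg

# Tiny synthetic example (hypothetical data); first column of X is the bias feature
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.array([0.1, 0.2])
print(cost(theta, X, y, lambda_=1.0))
```

Because `theta[0]` never enters `reg`, changing lambda changes the cost only through the non-bias parameters.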

Theta0 refers to the bias. The bias comes into the picture when we want our decision boundaries to be separated properly. Consider the example of

Y1 = w1 * X and Y2 = w2 * X

When the values of X come close to zero, it can become very hard to separate the two lines; this is where the bias comes into play:

Y1 = w1 * X + b1 and Y2 = w2 * X + b2

Now, through learning, the decision boundaries stay clearly separated all the time.
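A quick numeric check of the point above (with arbitrary illustrative values for w1, w2, b1, b2): without a bias both lines pass through the origin, so near X = 0 they are essentially indistinguishable, while the biases keep them apart.

```python
w1, w2 = 1.0, 2.0
x = 0.001  # a point close to zero

# Without bias, both lines pass through the origin: the gap vanishes near X = 0.
gap_no_bias = abs(w1 * x - w2 * x)

# With biases, the gap stays close to |b1 - b2| even at X = 0.
b1, b2 = 0.5, -0.5
gap_with_bias = abs((w1 * x + b1) - (w2 * x + b2))

print(gap_no_bias, gap_with_bias)
```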

Now let's consider why we use regularization: so that we don't over-fit, by smoothing the curve. As you can see from the equations, it is the slopes w1 and w2 that need smoothing; the biases are just the intercepts of the separating lines. So there is no point in regularizing them.

Although we can regularize the bias (and in the case of neural networks it usually makes little difference), we risk shrinking the bias value so much that the boundary no longer separates the data points well. Thus, it's better not to include the bias in regularization.
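This is exactly what the `grad_reg[0] = grad[0]` line in the question does. A minimal sketch of the same idea (function name and test values are illustrative, not from the course):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reg_gradient(theta, X, y, lambda_):
    m = len(y)
    grad = X.T @ (sigmoid(X @ theta) - y) / m
    # Penalty applies to every parameter ...
    penalty = (lambda_ / m) * theta
    # ... except the bias, whose penalty is zeroed out
    penalty[0] = 0.0
    return grad + penalty
```

With this, the gradient for `theta[0]` is identical whether lambda is zero or not, while all other components pick up the extra `(lambda_/m) * theta_j` term.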

Hope it answers your question. Originally published: https://medium.com/@shrutijadon10104776/why-we-dont-use-bias-in-regularization-5a86905dfcd6
