Compute the Loss of L1 and L2 Regularization
How do I calculate the loss of L1 and L2 regularization in Python, where w is the weight vector of a linear model?
The regularizers should compute the loss without considering the bias term in the weights.
def l1_reg(w):
    # TODO: Add your code here
    return None

def l2_reg(w):
    # TODO: Add your code here
    return None
While training your model you want the accuracy to be as high as possible, so you might use every correlated feature (column/predictor). But if your dataset is not big enough (i.e., the number of features n is much larger than the number of samples m), this causes what is called overfitting. Overfitting means your model performs very well on the training set but fails on the test set (training accuracy is much better than test accuracy). You can think of it as being able to solve a problem you have solved before, yet failing on a merely similar problem because you are overthinking it (not the same problem, but a similar one). This is where regularization comes in.

Let's first explain the logic behind regularization.

Regularization is the process of adding information. (You can think of it like this: before giving you another problem, I add more information to the first one so that you categorize it, and then you don't overthink when you encounter a similar problem.)
This image shows an overfitted model and an accurate model.
L1 and L2 are the types of information added to your model equation.

In L1, the term you add to the model equation is the sum of the absolute values of the theta vector (θ), multiplied by the regularization parameter (λ) over the size of the data (m), where λ can be any large number and n is the number of features.

In L2, the term you add is the sum of the squared entries of θ, multiplied by the regularization parameter (λ) over the size of the data (m).

In the normal-equation form, L2 regularization becomes an (n+1)×(n+1) diagonal matrix, with a zero in the upper-left entry and ones on the remaining diagonal entries, multiplied by the regularization parameter (λ).
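As a minimal sketch of the two penalty terms described above (the function names, λ, and m values here are illustrative, and the bias term θ[0] is excluded as the question requires):

```python
import numpy as np

def l1_penalty(theta, lam, m):
    # (λ / m) * sum of absolute values, excluding the bias term theta[0]
    return (lam / m) * np.sum(np.abs(theta[1:]))

def l2_penalty(theta, lam, m):
    # (λ / m) * sum of squares, excluding the bias term theta[0]
    return (lam / m) * np.sum(theta[1:] ** 2)

theta = np.array([0.5, 1.0, -2.0])
print(l1_penalty(theta, lam=10.0, m=5))  # (10/5) * (1 + 2) = 6.0
print(l2_penalty(theta, lam=10.0, m=5))  # (10/5) * (1 + 4) = 10.0
```

Note that some formulations scale the L2 term by λ/(2m) instead so that the gradient comes out cleaner; the idea is the same.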
I think it is important to clarify this before answering: the L1 and L2 regularization terms aren't loss functions. They help to control the weights in the vector so that they don't become too large, which can reduce overfitting.
The L1 regularization term is the sum of the absolute values of the elements. For a length-N vector, it would be |w[1]| + |w[2]| + ... + |w[N]|.
The L2 regularization term is the sum of the squared values of the elements. For a length-N vector, it would be w[1]² + w[2]² + ... + w[N]². I hope this helps!
import numpy as np

def calculateL2(vector):
    # sum of squared entries, computed as a dot product
    return np.dot(vector, vector)

def calculateL1(vector):
    # sum of absolute values of the entries
    return np.sum(np.abs(vector))
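Plugging this approach into the question's stubs, and dropping the bias term w[0] as the question requires, might look like:

```python
import numpy as np

def l1_reg(w):
    # L1 term: sum of absolute values, bias w[0] excluded
    return np.sum(np.abs(w[1:]))

def l2_reg(w):
    # L2 term: sum of squares, bias w[0] excluded
    return np.dot(w[1:], w[1:])

w = np.array([3.0, 1.0, -2.0])
print(l1_reg(w))  # |1| + |-2| = 3.0
print(l2_reg(w))  # 1 + 4 = 5.0
```

Multiply either result by your chosen λ (and divide by m, if you follow the convention above) before adding it to the base loss.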