
python sklearn: what is the difference between “sklearn.preprocessing.normalize(X, norm='l2')” and “sklearn.svm.LinearSVC(penalty='l2')”?

Here are two methods that both mention “l2” normalization:

1: This one is used in data pre-processing: sklearn.preprocessing.normalize(X, norm='l2')

2: The other one is used in the classifier: sklearn.svm.LinearSVC(penalty='l2')

I want to know: what is the difference between them? Must both steps be used in a complete model, or is using just one of them enough?

These are two different things, and you normally need both in order to build a good SVC model.

1) The first one scales (normalizes) the data matrix X. By default, sklearn.preprocessing.normalize(X, norm='l2') divides each row (sample) i by its L2 norm, sqrt(sum(abs(X[i, :])**2)), so that every sample becomes a unit vector; pass axis=0 if you want to normalize columns (features) instead. This keeps the magnitude of any one sample or feature from becoming too large, which can make it hard for some algorithms to converge.
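A minimal sketch of this (the 2×2 matrix is made up for illustration): by default normalize rescales rows, and axis=0 switches it to columns.

```python
import numpy as np
from sklearn.preprocessing import normalize

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

# Default behaviour: each ROW (sample) is divided by its own L2 norm.
X_rows = normalize(X, norm='l2')          # axis=1 is the default
# e.g. row 0: [3, 4] / 5 -> [0.6, 0.8]

# Pass axis=0 to divide each COLUMN (feature) by its L2 norm instead.
X_cols = normalize(X, norm='l2', axis=0)

# Every row of X_rows (and every column of X_cols) now has unit L2 norm.
print(np.linalg.norm(X_rows, axis=1))
print(np.linalg.norm(X_cols, axis=0))
```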

2) Irrespective of how well scaled (and small in value) your data is, there may still be outliers, or some features (j) that are far too dominant, and your algorithm (LinearSVC()) may over-trust them when it shouldn't. This is where L2 regularization comes into play: besides the loss function the algorithm minimizes, a cost is applied to the coefficients so that they don't become too big. In other words, the squared coefficients of the model, sum over j of beta[j]^2, become an additional term in the SVC cost function. How much does this cost weigh? That is controlled by the C parameter: in LinearSVC, C scales the data-fitting (hinge) loss relative to the penalty, so a smaller C means stronger regularization and smaller coefficients.
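A small sketch of that effect, on synthetic data (the dataset and C values are made up for illustration): shrinking C strengthens the L2 penalty, which shrinks the learned coefficients.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic binary problem: the label depends mostly on the first feature.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = (X[:, 0] + 0.1 * rng.randn(200) > 0).astype(int)

# Smaller C = stronger L2 regularization = smaller coefficients.
strong = LinearSVC(penalty='l2', C=0.01, dual=False).fit(X, y)
weak = LinearSVC(penalty='l2', C=100.0, dual=False).fit(X, y)

print(np.linalg.norm(strong.coef_))  # heavily regularized: small weights
print(np.linalg.norm(weak.coef_))    # lightly regularized: larger weights
```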

To sum up: the first one decides what value each sample (or column) of the X matrix is divided by; the second one decides how much the size of the coefficients burdens the cost function.
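The two steps above can be combined in one estimator; a minimal sketch using a scikit-learn pipeline on a made-up dataset (Normalizer is the transformer form of preprocessing.normalize):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.svm import LinearSVC

# Illustrative synthetic dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Step 1: row-wise L2 scaling of the data (same as preprocessing.normalize).
# Step 2: L2-penalized linear SVC on the scaled data.
clf = make_pipeline(Normalizer(norm='l2'),
                    LinearSVC(penalty='l2', C=1.0, dual=False))

score = cross_val_score(clf, X, y, cv=5).mean()
print(score)
```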
