
python sklearn: what is the difference between “sklearn.preprocessing.normalize(X, norm='l2')” and “sklearn.svm.LinearSVC(penalty='l2')”?

Here are two methods that both mention “l2” normalization:

1: This one is used in data pre-processing: sklearn.preprocessing.normalize(X, norm='l2')

2: The other one is used in the classifier: sklearn.svm.LinearSVC(penalty='l2')

I want to know: what is the difference between them? Must both steps be used in a complete model, or is using just one of them enough?

These are two different things, and you normally need both in order to build a good SVC model.

1) The first one scales (normalizes) the data matrix X. By default, sklearn.preprocessing.normalize(X, norm='l2') divides each row (sample) i by its L2 norm, sqrt(sum(abs(X[i, :])**2)), so that every sample becomes a unit vector; pass axis=0 if you want to normalize columns (features) instead. This keeps the magnitude of any one sample or feature from becoming too large, which can make it hard for some algorithms to converge.
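A minimal sketch of this (the 2×2 matrix is made up for illustration): by default normalize rescales rows, and axis=0 switches it to columns.

```python
import numpy as np
from sklearn.preprocessing import normalize

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

# Default behaviour: each ROW (sample) is divided by its own L2 norm.
X_rows = normalize(X, norm='l2')          # axis=1 is the default
# e.g. row 0: [3, 4] / 5 -> [0.6, 0.8]

# Pass axis=0 to divide each COLUMN (feature) by its L2 norm instead.
X_cols = normalize(X, norm='l2', axis=0)

# Every row of X_rows (and every column of X_cols) now has unit L2 norm.
print(np.linalg.norm(X_rows, axis=1))
print(np.linalg.norm(X_cols, axis=0))
```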

2) Irrespective of how well scaled (and small in value) your data is, there may still be outliers, or some features (j) that are far too dominant, and your algorithm (LinearSVC()) may over-trust them when it shouldn't. This is where L2 regularization comes into play: besides the loss function the algorithm minimizes, a cost is applied to the coefficients so that they don't become too big. In other words, the squared coefficients of the model, sum over j of beta[j]^2, become an additional term in the SVC cost function. How much does this cost weigh? That is controlled by the C parameter: in LinearSVC, C scales the data-fitting (hinge) loss relative to the penalty, so a smaller C means stronger regularization and smaller coefficients.
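A small sketch of that effect, on synthetic data (the dataset and C values are made up for illustration): shrinking C strengthens the L2 penalty, which shrinks the learned coefficients.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic binary problem: the label depends mostly on the first feature.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = (X[:, 0] + 0.1 * rng.randn(200) > 0).astype(int)

# Smaller C = stronger L2 regularization = smaller coefficients.
strong = LinearSVC(penalty='l2', C=0.01, dual=False).fit(X, y)
weak = LinearSVC(penalty='l2', C=100.0, dual=False).fit(X, y)

print(np.linalg.norm(strong.coef_))  # heavily regularized: small weights
print(np.linalg.norm(weak.coef_))    # lightly regularized: larger weights
```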

To sum up: the first one decides what value each sample (or column) of the X matrix is divided by; the second one decides how much the size of the coefficients burdens the cost function.
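The two steps above can be combined in one estimator; a minimal sketch using a scikit-learn pipeline on a made-up dataset (Normalizer is the transformer form of preprocessing.normalize):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.svm import LinearSVC

# Illustrative synthetic dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Step 1: row-wise L2 scaling of the data (same as preprocessing.normalize).
# Step 2: L2-penalized linear SVC on the scaled data.
clf = make_pipeline(Normalizer(norm='l2'),
                    LinearSVC(penalty='l2', C=1.0, dual=False))

score = cross_val_score(clf, X, y, cv=5).mean()
print(score)
```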
