Is scipy.linalg.norm different from sklearn.preprocessing.normalize?

Question

from numpy.random import rand
from sklearn.preprocessing import normalize
from scipy.sparse import csr_matrix
from scipy.linalg import norm

w = (rand(1,10)<0.25)*rand(1,10)
x = (rand(1,10)<0.25)*rand(1,10)
w_csr = csr_matrix(w)
x_csr = csr_matrix(x)
(normalize(w_csr,axis=1,copy=False,norm='l2')*normalize(x_csr,axis=1,copy=False,norm='l2')).todense()

norm(w,ord='fro')*norm(x,ord='fro')

I am working with scipy csr_matrix and would like to normalize two matrices using the Frobenius norm and get their product. But norm from scipy.linalg and normalize from sklearn.preprocessing seem to evaluate the matrices differently. Since in both cases above I am technically calculating the same Frobenius norm, shouldn't the two expressions evaluate to the same thing? Instead I get the following results:

matrix([[ 0.962341]])

0.4431811178371029

for sklearn.preprocessing.normalize and scipy.linalg.norm respectively. I would really like to know what I am doing wrong.

Answer 1

sklearn.preprocessing.normalize divides each row by its norm and returns a matrix with the same shape as its input. scipy.linalg.norm returns the norm of the matrix, which is a single number. So your two calculations are not equivalent.
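The difference is easy to see on a small dense array. The following is a minimal sketch; the 2x2 array `a` is an illustrative assumption, not data from the question.

```python
import numpy as np
from scipy.linalg import norm
from sklearn.preprocessing import normalize

# Illustrative input (an assumption, not the data from the question).
a = np.array([[3.0, 4.0],
              [0.0, 2.0]])

# normalize returns an array of the same shape as the input;
# each row is divided by its own L2 norm.
a_rows = normalize(a, norm='l2', axis=1)
print(a_rows.shape)   # (2, 2)
print(a_rows)         # rows are close to [0.6, 0.8] and [0.0, 1.0]

# norm returns one scalar for the whole matrix
# (the Frobenius norm by default for a 2-D input).
a_fro = norm(a)
print(a_fro)          # sqrt(9 + 16 + 4) = sqrt(29)
```

Multiplying two normalized matrices and multiplying two norms are therefore different operations: the first combines matrices, the second combines two scalars.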

Note that your code is not correct as written. This line

(normalize(w_csr,axis=1,copy=False,norm='l2')*normalize(x_csr,axis=1,copy=False,norm='l2')).todense()

raises ValueError: dimension mismatch. Both calls to normalize return matrices with shape (1, 10), so their dimensions are not compatible for a matrix product. What did you do to get matrix([[ 0.962341]])?
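One way to get a 1x1 result like the one in the question is to transpose the second factor. That gives the inner product of the two normalized rows, which is their cosine similarity. This is only a guess at what was actually run, and the vectors below are made-up values rather than the asker's random data.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.preprocessing import normalize

# Made-up 1x4 row vectors (an assumption; the question used random data).
w_csr = csr_matrix(np.array([[1.0, 0.0, 2.0, 0.0]]))
x_csr = csr_matrix(np.array([[2.0, 0.0, 1.0, 0.0]]))

w_n = normalize(w_csr, norm='l2', axis=1)   # shape (1, 4)
x_n = normalize(x_csr, norm='l2', axis=1)   # shape (1, 4)

# (1, 4) times (4, 1) is a valid matrix product; (1, 4) times (1, 4) is not.
cos_sim = w_n.dot(x_n.T).toarray()
print(cos_sim)   # approximately [[0.8]]: (1*2 + 2*1) / (sqrt(5) * sqrt(5))
```

A cosine similarity always lies in [-1, 1], whereas the product of two Frobenius norms can be any non-negative number, so the two values in the question have no reason to agree.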

Here's a simple function to compute the Frobenius norm of a sparse (e.g. CSR or CSC) matrix:

import numpy as np

def spnorm(a):
    return np.sqrt((a.data**2).sum())  # only stored entries contribute; assumes real values

For example,

In [182]: b_csr
Out[182]: 
<3x5 sparse matrix of type '<type 'numpy.float64'>'
with 5 stored elements in Compressed Sparse Row format>

In [183]: b_csr.A
Out[183]: 
array([[ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  2.,  0.,  4.,  0.],
       [ 0.,  0.,  0.,  2.,  1.]])

In [184]: spnorm(b_csr)
Out[184]: 5.0990195135927845

In [185]: norm(b_csr.A)
Out[185]: 5.0990195135927845
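The session above can be reproduced as a self-contained script. The sketch below rebuilds the same 3x5 matrix and, as an addition not mentioned in the answer, also compares against `scipy.sparse.linalg.norm`, assuming a SciPy version that provides it.

```python
import numpy as np
from scipy.linalg import norm
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import norm as sparse_norm

def spnorm(a):
    # Frobenius norm from the stored entries only; implicit zeros add nothing.
    return np.sqrt((a.data ** 2).sum())

b = np.array([[1., 0., 0., 0., 0.],
              [0., 2., 0., 4., 0.],
              [0., 0., 0., 2., 1.]])
b_csr = csr_matrix(b)

r_sp = spnorm(b_csr)              # computed from the sparse data array
r_dense = norm(b_csr.toarray())   # dense Frobenius norm from scipy.linalg
r_lib = sparse_norm(b_csr)        # SciPy's sparse norm, Frobenius by default
print(r_sp, r_dense, r_lib)       # all approximately sqrt(26) = 5.0990...
```

All three agree because the sum of squared entries is 1 + 4 + 16 + 4 + 1 = 26, and zeros contribute nothing to that sum.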

Answer 1: accepted, 2013-12-05 15:33:19
