scipy sparse matrix division

I have been trying to divide a Python scipy sparse matrix by the vector of its row sums. Here is my code:

sparse_mat = bsr_matrix((l_data, (l_row, l_col)), dtype=float)
sparse_mat = sparse_mat / (sparse_mat.sum(axis = 1)[:,None])

However, it throws an error no matter how I try it:

sparse_mat = sparse_mat / (sparse_mat.sum(axis = 1)[:,None])
File "/usr/lib/python2.7/dist-packages/scipy/sparse/base.py", line 381, in __div__
return self.__truediv__(other)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/compressed.py", line 427, in __truediv__
raise NotImplementedError

Does anyone have an idea of where I am going wrong?

You can circumvent the problem by creating a sparse diagonal matrix from the reciprocals of your row sums and then multiplying it with your matrix. In the product, the diagonal matrix goes on the left and your matrix on the right, so that each row is scaled by its own reciprocal.


>>> a
array([[0, 9, 0, 0, 1, 0],
       [2, 0, 5, 0, 0, 9],
       [0, 2, 0, 0, 0, 0],
       [2, 0, 0, 0, 0, 0],
       [0, 9, 5, 3, 0, 7],
       [1, 0, 0, 8, 9, 0]])
>>> b = sparse.bsr_matrix(a)
>>> c = sparse.diags(1/b.sum(axis=1).A.ravel())
>>> # on older scipy versions the offsets parameter (default 0)
... # is a required argument, thus
... # c = sparse.diags(1/b.sum(axis=1).A.ravel(), 0)
>>> a/a.sum(axis=1, keepdims=True)
array([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
       [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
       [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
       [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])
>>> (c @ b).todense() # on Python < 3.5 replace c @ b with c.dot(b)
matrix([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
        [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
        [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
        [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
        [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
        [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])
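As a minimal sketch, the same idea can be wrapped in a small helper. The name normalize_rows and the handling of rows that sum to zero (they are left as all zeros instead of producing inf or nan) are assumptions added for illustration, not part of the answer above.

```python
import numpy as np
from scipy import sparse


def normalize_rows(mat):
    """Return a sparse copy of `mat` in which every nonzero row sums to 1."""
    # Work on a float CSR copy so the scaled values are not truncated.
    mat = sparse.csr_matrix(mat, dtype=float)
    row_sums = np.asarray(mat.sum(axis=1)).ravel()

    # Reciprocal of each row sum; rows summing to zero get a factor of 0
    # (assumption: such rows should simply stay all-zero).
    inv = np.zeros_like(row_sums)
    nonzero = row_sums != 0
    inv[nonzero] = 1.0 / row_sums[nonzero]

    # Diagonal matrix on the left scales the rows; the product stays sparse.
    return sparse.diags(inv) @ mat


b = sparse.bsr_matrix(np.array([[0, 9, 0, 0, 1, 0],
                                [2, 0, 5, 0, 0, 9]]))
normalized = normalize_rows(b)
print(normalized.toarray())
```

The helper converts to CSR first only to keep the sketch simple; the diagonal product itself works with other sparse formats as well.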

Something funny is going on. I have no problem performing the element-wise division. I wonder if it's a Py2 issue; I'm using Py3.

In [1022]: A=sparse.bsr_matrix([[2,4],[1,2]])
In [1023]: A
<2x2 sparse matrix of type '<class 'numpy.int32'>'
    with 4 stored elements (blocksize = 2x2) in Block Sparse Row format>
In [1024]: A.A
array([[2, 4],
       [1, 2]], dtype=int32)
In [1025]: A.sum(axis=1)
matrix([[6],
        [3]], dtype=int32)
In [1026]: A/A.sum(axis=1)
matrix([[ 0.33333333,  0.66666667],
        [ 0.33333333,  0.66666667]])

Or, to try the other example:

In [1027]: b=sparse.bsr_matrix([[0, 9, 0, 0, 1, 0],
      ...:        [2, 0, 5, 0, 0, 9],
      ...:        [0, 2, 0, 0, 0, 0],
      ...:        [2, 0, 0, 0, 0, 0],
      ...:        [0, 9, 5, 3, 0, 7],
      ...:        [1, 0, 0, 8, 9, 0]])
In [1028]: b
<6x6 sparse matrix of type '<class 'numpy.int32'>'
    with 14 stored elements (blocksize = 1x1) in Block Sparse Row format>
In [1029]: b.sum(axis=1)
matrix([[10],
        [16],
        [ 2],
        [ 2],
        [24],
        [18]], dtype=int32)
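In [1030]: b/b.sum(axis=1)
matrix([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
        [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
        [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
        [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
        [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
        [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])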

The result of this sparse/dense division is also dense, whereas c*b (where c is the sparse diagonal matrix from the other answer) is sparse.

In [1039]: c*b
<6x6 sparse matrix of type '<class 'numpy.float64'>'
    with 14 stored elements in Compressed Sparse Row format>

The sparse sum is a dense matrix. It is already 2d, so there's no need to expand its dimensions. In fact, if I try that I get an error:

In [1031]: A/(A.sum(axis=1)[:,None])
ValueError: shape too large to be a matrix.
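A quick way to check this, as a sketch using the same 2x2 example: the row sum already has shape (n, 1), and converting it with np.asarray(...).ravel() gives a plain 1-D array when one is needed. The variable names row_sums and flat are only illustrative.

```python
import numpy as np
from scipy import sparse

A = sparse.bsr_matrix([[2, 4], [1, 2]])

# Summing along axis 1 returns a dense 2-D column, one entry per row.
row_sums = A.sum(axis=1)
print(row_sums.shape)

# Flatten to an ordinary 1-D ndarray for use with e.g. sparse.diags.
flat = np.asarray(row_sums).ravel()
print(flat.shape)
```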

Per this message, to keep the matrix sparse you can work directly on the stored data values, indexing the row sums with the (nonzero) row indices:

sums = np.asarray(A.sum(axis=1)).squeeze()  # this is dense
A.data /= sums[A.nonzero()[0]]
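Here is a self-contained sketch of that in-place approach. It assumes A is a CSR matrix with a float dtype (an integer matrix cannot store the fractions in place) and that it holds no explicitly stored zeros, so that A.nonzero()[0] lines up one-to-one with A.data. The example values are made up.

```python
import numpy as np
from scipy import sparse

# Float dtype so the in-place division can store fractional results.
A = sparse.csr_matrix([[2, 4], [1, 2], [0, 5]], dtype=float)

sums = np.asarray(A.sum(axis=1)).ravel()  # dense, one sum per row

# A.nonzero()[0] is the row index of every stored entry, so
# sums[...] picks the matching row sum for each value in A.data.
A.data /= sums[A.nonzero()[0]]

print(A.toarray())
```

Only the stored values are touched, so the sparsity pattern is unchanged and no dense copy of A is created.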

To divide by the mean of each row's nonzero entries instead of the sum, one can use:

nnz = A.getnnz(axis=1)  # this is also dense
means = sums / nnz
A.data /= means[A.nonzero()[0]]
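A runnable sketch of the mean variant, under the same assumptions as before (float CSR matrix, no explicitly stored zeros) plus one more: every row has at least one stored entry, since an empty row would give a 0/0 mean. The example values are made up.

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix([[2, 4, 0],
                       [0, 3, 0],
                       [1, 2, 3]], dtype=float)

sums = np.asarray(A.sum(axis=1)).ravel()  # dense row sums
nnz = A.getnnz(axis=1)                    # stored entries per row (dense)
means = sums / nnz                        # mean over stored entries only

# Scale each stored value by the mean of its own row.
A.data /= means[A.nonzero()[0]]

print(A.toarray())
```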
