scipy sparse matrix division

Question

I have been trying to divide a python scipy sparse matrix by a vector sum of its rows. Here is my code

sparse_mat = bsr_matrix((l_data, (l_row, l_col)), dtype=float)
sparse_mat = sparse_mat / (sparse_mat.sum(axis = 1)[:,None])

However, it throws an error no matter how I try it

sparse_mat = sparse_mat / (sparse_mat.sum(axis = 1)[:,None])
File "/usr/lib/python2.7/dist-packages/scipy/sparse/base.py", line 381, in __div__
return self.__truediv__(other)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/compressed.py", line 427, in __truediv__
raise NotImplementedError
NotImplementedError

Anyone with an idea of where I am going wrong?

Answer 1

You can circumvent the problem by creating a sparse diagonal matrix from the reciprocals of your row sums and then multiplying it with your matrix. In the product the diagonal matrix goes left and your matrix goes right.

Example:

>>> a
array([[0, 9, 0, 0, 1, 0],
       [2, 0, 5, 0, 0, 9],
       [0, 2, 0, 0, 0, 0],
       [2, 0, 0, 0, 0, 0],
       [0, 9, 5, 3, 0, 7],
       [1, 0, 0, 8, 9, 0]])
>>> b = sparse.bsr_matrix(a)
>>> 
>>> c = sparse.diags(1/b.sum(axis=1).A.ravel())
>>> # on older scipy versions the offsets parameter (default 0)
... # is a required argument, thus
... # c = sparse.diags(1/b.sum(axis=1).A.ravel(), 0)
...
>>> a/a.sum(axis=1, keepdims=True)
array([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
       [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
       [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
       [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])
>>> (c @ b).todense() # on Python < 3.5 replace c @ b with c.dot(b)
matrix([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
        [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
        [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
        [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
        [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
        [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])

Answer 2

Something funny is going on. I have no problem performing the element division. I wonder if it's a Py2 issue. I'm using Py3.

In [1022]: A=sparse.bsr_matrix([[2,4],[1,2]])
In [1023]: A
Out[1023]: 
<2x2 sparse matrix of type '<class 'numpy.int32'>'
    with 4 stored elements (blocksize = 2x2) in Block Sparse Row format>
In [1024]: A.A
Out[1024]: 
array([[2, 4],
       [1, 2]], dtype=int32)
In [1025]: A.sum(axis=1)
Out[1025]: 
matrix([[6],
        [3]], dtype=int32)
In [1026]: A/A.sum(axis=1)
Out[1026]: 
matrix([[ 0.33333333,  0.66666667],
        [ 0.33333333,  0.66666667]])

or to try the other example:

In [1027]: b=sparse.bsr_matrix([[0, 9, 0, 0, 1, 0],
      ...:        [2, 0, 5, 0, 0, 9],
      ...:        [0, 2, 0, 0, 0, 0],
      ...:        [2, 0, 0, 0, 0, 0],
      ...:        [0, 9, 5, 3, 0, 7],
      ...:        [1, 0, 0, 8, 9, 0]])
In [1028]: b
Out[1028]: 
<6x6 sparse matrix of type '<class 'numpy.int32'>'
    with 14 stored elements (blocksize = 1x1) in Block Sparse Row format>
In [1029]: b.sum(axis=1)
Out[1029]: 
matrix([[10],
        [16],
        [ 2],
        [ 2],
        [24],
        [18]], dtype=int32)
In [1030]: b/b.sum(axis=1)
Out[1030]: 
matrix([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       , 0.        ],
        [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        , 0.5625    ],
 ....
        [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,     0.        ]])

The result of this sparse/dense is also dense, where as the c*b ( c is the sparse diagonal) is sparse.

In [1039]: c*b
Out[1039]: 
<6x6 sparse matrix of type '<class 'numpy.float64'>'
    with 14 stored elements in Compressed Sparse Row format>

The sparse sum is a dense matrix. It is 2d, so there's no need to expand it dimensions. In fact if I try that I get an error:

In [1031]: A/(A.sum(axis=1)[:,None])
....
ValueError: shape too large to be a matrix.

Answer 3

Per this message , to keep the matrix sparse, you access the data values and use the (nonzero) indices:

sums = np.asarray(A.sum(axis=1)).squeeze()  # this is dense
A.data /= sums[A.nonzero()[0]]

If dividing by the nonzero row mean instead of the sum, one can

nnz = A.getnnz(axis=1)  # this is also dense
means = sums / nnz
A.data /= means[A.nonzero()[0]]

scipy sparse matrix division

Question

3 answers

solution1
8 ACCPTED 2017-02-14 12:34:41

solution2
3 2017-02-14 17:31:35

solution3
0 2021-05-24 23:57:29

scipy sparse matrix division

Question

3 answers

solution1 8 ACCPTED 2017-02-14 12:34:41

solution2 3 2017-02-14 17:31:35

solution3 0 2021-05-24 23:57:29

solution1
8 ACCPTED 2017-02-14 12:34:41

solution2
3 2017-02-14 17:31:35

solution3
0 2021-05-24 23:57:29