
numpy/scipy equivalent of MATLAB's sparse function

I'm porting MATLAB code to Python with numpy and scipy, and I need the numpy/scipy equivalent of MATLAB's sparse function.

Here's the usage of the sparse function in MATLAB,

sparse([3; 2], [2; 4], [3; 0])

gives:

    Trial>> m = sparse([3; 2], [2; 4], [3; 0])

    m =

       (3,2)        3

    Trial>> full(m)

    ans =

         0     0     0     0
         0     0     0     0
         0     3     0     0

I have tried these, but none of them gives what the MATLAB version does:

sps.csr_matrix([3, 2], [2, 4], [3, 0])
sps.csr_matrix(np.array([[3], [2]]), np.array([[2], [4]]), np.array([[3], [0]])) 
sps.csr_matrix([[3], [2]], [[2], [4]], [[3], [0]])  

Any ideas? Thanks.

You're using the sparse(I, J, SV) form [note: link goes to documentation for GNU Octave, not Matlab]. The scipy.sparse equivalent is csr_matrix((SV, (I, J))): yes, a single argument, which is a 2-tuple containing a vector and a 2-tuple of vectors. You also have to subtract 1 from each index vector, because Python consistently uses 0-based indexing while Matlab uses 1-based indexing.

>>> m = sps.csr_matrix(([3,0], ([2,1], [1,3]))); m
<3x4 sparse matrix of type '<class 'numpy.int64'>'
    with 2 stored elements in Compressed Sparse Row format>

>>> m.todense()
matrix([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 3, 0, 0]], dtype=int64)
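
When the index vectors come from existing MATLAB-style data rather than hand-written literals, the shift to 0-based indexing can be done with array arithmetic. A minimal sketch, where the names I, J and SV are chosen only to mirror the MATLAB call:

```python
import numpy as np
import scipy.sparse as sps

# MATLAB-style (1-based) inputs, as in sparse([3; 2], [2; 4], [3; 0])
I = np.array([3, 2])
J = np.array([2, 4])
SV = np.array([3, 0])

# Shift both index vectors to 0-based before handing them to scipy
m = sps.csr_matrix((SV, (I - 1, J - 1)))

print(m.shape)      # (3, 4), inferred from the largest row and column index
print(m.toarray())
```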

Note that scipy, unlike Matlab, does not automatically discard explicit zeroes, and will use integer storage for matrices containing only integers. To perfectly match the matrix you got in Matlab, you must explicitly ask for floating-point storage and you must call eliminate_zeros() on the result:

>>> m2 = sps.csr_matrix(([3,0], ([2,1], [1,3])), dtype=np.float64)
>>> m2.eliminate_zeros()
>>> m2
<3x4 sparse matrix of type '<class 'numpy.float64'>'
    with 1 stored elements in Compressed Sparse Row format>
>>> m2.todense()
matrix([[ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.],
        [ 0.,  3.,  0.,  0.]])

You could also change [3,0] to [3., 0.] but I recommend an explicit dtype= argument because that will prevent surprises when you are feeding in real data.
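
To see the dtype point concretely, here is a small sketch comparing the two constructions. The variable names are only illustrative, and the exact integer width you get without dtype= depends on your platform:

```python
import numpy as np
import scipy.sparse as sps

rows, cols = [2, 1], [1, 3]

as_given = sps.csr_matrix(([3, 0], (rows, cols)))
as_float = sps.csr_matrix(([3, 0], (rows, cols)), dtype=np.float64)

print(as_given.dtype)   # an integer dtype (exact width depends on platform)
print(as_float.dtype)   # float64

# The explicit zero stays stored until eliminate_zeros() is called
print(as_float.nnz)     # 2
as_float.eliminate_zeros()
print(as_float.nnz)     # 1
```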

(I don't know what Matlab's internal sparse matrix representation is, but Octave appears to default to compressed sparse column representation. The difference between CSC and CSR should only affect performance. If your NumPy code winds up being slower than your Matlab code, try using sps.csc_matrix instead of csr_matrix, as well as all the usual NumPy performance tips.)
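
As a rough illustration that the two formats describe the same matrix and differ only in layout, the following sketch builds both from the same triplets and also converts one into the other:

```python
import numpy as np
import scipy.sparse as sps

data, rows, cols = [3.0], [2], [1]

csr = sps.csr_matrix((data, (rows, cols)), shape=(3, 4))
csc = sps.csc_matrix((data, (rows, cols)), shape=(3, 4))

# Same matrix, different internal layout
print(csr.format, csc.format)                         # csr csc
print(np.array_equal(csr.toarray(), csc.toarray()))   # True

# An existing matrix can also be converted after the fact
csc_from_csr = csr.tocsc()
```

Which format is faster depends on the operations you perform, so it is worth timing both on your real workload.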

(You probably need to read NumPy for Matlab users if you haven't already.)

Here is a conversion I made. It works for the five-argument form of sparse. Note that it passes the indices straight through to scipy, so they must already be 0-based.

import scipy.sparse


def sparse(i, j, v, m, n):
    """
    Build an m-by-n sparse matrix from coordinate (triplet) data.

    Parameters:
        i: 1-D array of 0-based row indices
            Size n1
        j: 1-D array of 0-based column indices
            Size n1
        v: 1-D array of values
            Size n1
        m: integer, number of rows (must be > max(i))
        n: integer, number of columns (must be > max(j))
    Returns:
        s: scipy.sparse.csr_matrix of shape (m, n)
            Zero everywhere except for the values v at positions (i, j)
    """
    return scipy.sparse.csr_matrix((v, (i, j)), shape=(m, n))
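
If you would rather keep calling it with MATLAB-style 1-based indices, a wrapper along the following lines may be closer to a drop-in replacement. This is only a sketch: matlab_sparse is a hypothetical name, and choosing CSC storage, forcing float64 and dropping explicit zeros are assumptions meant to mirror the MATLAB behaviour discussed above.

```python
import numpy as np
import scipy.sparse as sps


def matlab_sparse(i, j, v, m=None, n=None):
    """Hypothetical helper: MATLAB-like sparse(i, j, v[, m, n]) with 1-based indices."""
    i = np.asarray(i).ravel() - 1   # 1-based -> 0-based rows
    j = np.asarray(j).ravel() - 1   # 1-based -> 0-based columns
    v = np.asarray(v, dtype=np.float64).ravel()
    # Let scipy infer the shape unless both dimensions are given
    shape = None if m is None or n is None else (m, n)
    s = sps.csc_matrix((v, (i, j)), shape=shape)
    s.eliminate_zeros()             # drop explicitly stored zeros
    return s


s = matlab_sparse([3, 2], [2, 4], [3, 0])
print(s.shape, s.nnz)   # (3, 4) 1
```

scipy sums the values of repeated (i, j) pairs when it builds the matrix from triplets, which matches how MATLAB's sparse accumulates duplicates.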
