
Sparse matrix with fast access

While working with large SciPy CSR sparse matrices, I noticed that slicing out a single row is very slow, apparently because it makes a copy.

Is there a way to make a sparse matrix that references an existing row instead of copying it? Or is there a more fitting implementation than the CSR matrix?

What I need is fast lookup of elements and rows, and fast lookup of all non-zero indices of a vector. I never need to modify the matrix or perform other operations on it.

You can take advantage of the CSR representation to slice the underlying arrays directly and share the data with a new CSR matrix:

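import numpy as np
import scipy.sparse as sp

mat = sp.random(1000, 1000, density=0.01, format="csr")  # stand-in for your CSR matrix
i = 42  # the index of whatever row you want
start, stop = mat.indptr[i], mat.indptr[i + 1]
# Basic slices of data and indices are views, not copies of mat's arrays.
noncopy_row_i = sp.csr_matrix((mat.data[start:stop],
                               mat.indices[start:stop],
                               np.array([0, stop - start])),
                              shape=(1, mat.shape[1]))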
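
If only the non-zero column indices and values of a row are needed, building a new matrix can be skipped altogether. The following is a minimal sketch of that idea, assuming the matrix is in canonical CSR form (sorted column indices, no duplicates). `row_view` and `get_element` are hypothetical helper names, not SciPy functions, and the small matrix is illustrative data only.

```python
import numpy as np
import scipy.sparse as sp


def row_view(mat, i):
    """Return (column indices, values) of row i as views into mat's arrays."""
    start, stop = mat.indptr[i], mat.indptr[i + 1]
    return mat.indices[start:stop], mat.data[start:stop]


def get_element(mat, i, j):
    """Look up mat[i, j] by binary search; assumes sorted column indices."""
    cols, vals = row_view(mat, i)
    pos = np.searchsorted(cols, j)
    if pos < len(cols) and cols[pos] == j:
        return vals[pos]
    return mat.dtype.type(0)  # implicit zero: (i, j) is not stored


# Small example matrix (illustrative data only).
dense = np.array([[0.0, 2.0, 0.0, 3.0],
                  [0.0, 0.0, 0.0, 0.0],
                  [5.0, 0.0, 7.0, 0.0]])
mat = sp.csr_matrix(dense)
mat.sort_indices()  # ensure column indices are sorted within each row

cols, vals = row_view(mat, 2)  # non-zero columns and values of row 2
```

Because `row_view` returns plain NumPy slices, nothing is copied, and the returned index array is already the list of non-zero positions in that row.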

SciPy supports several sparse matrix formats: https://docs.scipy.org/doc/scipy/reference/sparse.html#usage-information

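A different format may give faster element lookup, but you can lose out on other operations. Note that coo_matrix, for example, does not support indexing or slicing directly, so it would have to be converted first. I think the best way is to benchmark on your own data and algorithms.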
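
A benchmark along those lines could look like the sketch below. It assumes a randomly generated stand-in matrix and an arbitrary row index, so the numbers only become meaningful once `mat` is replaced with real data. `slice_row` and `direct_row` are hypothetical names chosen for this example.

```python
import timeit

import numpy as np
import scipy.sparse as sp

# Random stand-in data; replace with your own matrix for meaningful numbers.
mat = sp.random(2000, 2000, density=0.01, format="csr")
i = 123  # arbitrary row index for the comparison


def slice_row():
    """Regular SciPy row slicing, which builds a new 1 x N matrix."""
    return mat[i]


def direct_row():
    """Slice the underlying CSR arrays directly (views, no new matrix)."""
    start, stop = mat.indptr[i], mat.indptr[i + 1]
    return mat.indices[start:stop], mat.data[start:stop]


n_calls = 1000
timings = {
    "mat[i]": timeit.timeit(slice_row, number=n_calls),
    "direct slice": timeit.timeit(direct_row, number=n_calls),
}
for name, seconds in timings.items():
    print(f"{name}: {seconds / n_calls * 1e6:.2f} us per call")
```

The same pattern can be repeated with other formats, or with element lookups instead of row lookups, to see which trade-off suits a given workload.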
