Set multiple rows and columns of large scipy sparse matrix to 0s

I'm trying to "translate" some Matlab code into Python, and one line in the Matlab code sets multiple rows of a sparse matrix to 0s:

Ks(idx,:)=0; %no affinity for W inside fs

where Ks is the sparse matrix (symmetric and quite large), and idx is a 1D array of the row indices to change, which is also quite large. The next line sets the same columns to 0s, so Ks stays symmetric:

Ks(:,idx)=0;

Doing the same thing in Python ( Ks[idx,:]=0 ) seems to work only for small matrices; when the matrix gets big I get a MemoryError . Currently my Ks is a csr matrix, and converting it to lil to do the assignment is super slow.

I'm not very familiar with sparse matrices. I know that in Python there is more than one type (e.g. csr, csc, lil), but the Matlab code makes no such distinction; I only found a call to sparse() . So what's my best bet in this situation?

Thanks in advance.

One way to speed this up is, instead of setting elements of the sparse matrix to zero, to first set the elements of the dense NumPy array to zero and only then convert it to a sparse matrix. This assumes the dense array fits in memory. In the example below it gave a speedup of more than 10 times.

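import numpy as np
import scipy.sparse as sps

np.random.seed(20)
# Dense random integer matrix, symmetrized so it mimics the symmetric Ks
mat = np.random.randint(-2000, 2000, size=(1000, 1000))
sym_mat = (mat + mat.T) / 2

# Indices of the rows/columns to zero (may contain duplicates)
zero_rows = np.random.randint(0, 999, (900,))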


%%timeit
sparse = sps.csr_matrix(sym_mat)
sparse[zero_rows,:] = 0
sparse[:,zero_rows] = 0

/usr/local/lib/python3.6/dist-packages/scipy/sparse/compressed.py:774: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  SparseEfficiencyWarning)
1 loop, best of 3: 206 ms per loop

%%timeit
sym_mat[zero_rows,:] = 0
sym_mat[:,zero_rows] = 0
sparse1 = sps.csr_matrix(sym_mat)

100 loops, best of 3: 18.9 ms per loop
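
If the dense array does not fit in memory, a different technique (not the one timed above) is to multiply by a diagonal 0/1 mask on both sides, which zeroes the chosen rows and columns without ever leaving sparse format. The following is only a minimal sketch: the helper name zero_rows_and_cols and the random test matrix are made up for illustration, and it has not been benchmarked against the timings above.

```python
import numpy as np
import scipy.sparse as sps


def zero_rows_and_cols(Ks, idx):
    """Return a CSR copy of square Ks with rows and columns idx zeroed.

    Hypothetical helper for illustration: builds a diagonal matrix D with
    0 at the positions in idx and 1 elsewhere, then computes D @ Ks @ D.
    """
    n = Ks.shape[0]
    keep = np.ones(n, dtype=Ks.dtype)
    keep[idx] = 0                      # duplicates in idx are harmless
    D = sps.diags(keep, format="csr")  # sparse diagonal mask
    out = (D @ Ks @ D).tocsr()         # left factor zeroes rows, right factor zeroes columns
    out.eliminate_zeros()              # drop any explicitly stored zeros
    return out


# Made-up symmetric sparse test matrix (1% density before symmetrizing)
A = sps.random(1000, 1000, density=0.01, format="csr", random_state=20)
Ks = (A + A.T).tocsr()

rng = np.random.default_rng(20)
idx = rng.integers(0, 1000, size=900)

Ks_zeroed = zero_rows_and_cols(Ks, idx)
```

Because the same mask is applied on both sides, a symmetric Ks stays symmetric, and the original matrix is left unmodified.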
