Slicing a sparse scipy matrix to subsample every 10th row and column

I am trying to subsample a scipy sparse matrix the same way I would a numpy matrix, taking every 10th row and every 10th column:

connections = sparse.csr_matrix((data, (node1_index, node2_index)),
                                shape=(dimensions, dimensions))
connections_sampled = np.zeros((dimensions // 10, dimensions // 10))
connections_sampled = connections[::10, ::10]

However, when I run this and query the shape of connections_sampled, I get the original dimensions of connections instead of dimensions reduced by a factor of 10.

Does this type of subsampling not work with sparse matrices? It seems to work when I use smaller matrices, but I can't get it to give the correct answer here.

You cannot take every 10th row and column of a CSR matrix with a strided slice, at least not in SciPy 0.12:

>>> import scipy.sparse as sps
>>> a = sps.rand(1000, 1000, format='csr')
>>> a[::10, ::10]
Traceback (most recent call last):
ValueError: slicing with step != 1 not supported

You can do it, though, by first converting to a LIL format matrix:

>>> a.tolil()[::10, ::10]
<100x100 sparse matrix of type '<type 'numpy.float64'>'
    with 97 stored elements in LInked List format>

As you can see, the shape is updated correctly. If you want a numpy array rather than a sparse matrix, try:

>>> a.tolil()[::10, ::10].A
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])
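
To check that the LIL route really picks out every 10th row and column, the result can be compared against plain numpy slicing of the dense matrix. The following is only a minimal sketch: the diagonal test matrix `a`, its size `n`, and the variable names are made up for illustration, and slicing support in other sparse formats may differ between SciPy versions.

```python
import numpy as np
import scipy.sparse as sps

# Hypothetical test matrix: 1000x1000 CSR with entry (i, i) equal to i + 1.
n = 1000
idx = np.arange(n)
a = sps.csr_matrix((idx + 1.0, (idx, idx)), shape=(n, n))

# Convert to LIL first, then take every 10th row and every 10th column.
sampled = a.tolil()[::10, ::10]

# Dense copy of the subsampled matrix, for comparison with numpy slicing.
dense = sampled.toarray()
reference = a.toarray()[::10, ::10]

print(sampled.shape)
print(np.array_equal(dense, reference))
```

If later steps need fast arithmetic or matrix products, converting the subsampled result back with `sampled.tocsr()` is usually worthwhile, since LIL is intended mainly for construction and slicing.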

