Finding the row indices of zero elements in csc_matrix

I have a csc_matrix like this:

>>> arr_csc = arr.tocsc()
>>> arr_csc
<2x3 sparse matrix of type '<type 'numpy.int64'>'
    with 5 stored elements in Compressed Sparse Column format>
>>> arr_csc.todense()
matrix([[0, 1, 0],
        [3, 4, 0]])

Now, what I want is the row indices of all zero elements in each column. For example:

For column 0, I want "[0]"
For column 1, I want "[]"
For column 2, I want "[0, 1]"

What is the fastest way to do this?

Thanks!

How about something like this:

The main idea is to use .indptr and .indices; the rest of my solution can probably be improved on.

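from scipy import sparse
import numpy as np

arr_csc = sparse.csc_matrix([[0, 1, 0],
                             [3, 4, 0]])

result = []

# Every row index; the zero rows of a column are those with no stored entry.
all_rows = np.arange(arr_csc.shape[0])

# indptr[i]:indptr[i+1] delimits the stored entries of column i.
for i in range(len(arr_csc.indptr) - 1):
    start = arr_csc.indptr[i]
    end = arr_csc.indptr[i + 1]

    # indices[start:end] holds the row indices stored in column i.
    result.append(np.setdiff1d(all_rows, arr_csc.indices[start:end]))

print(result)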

Result:

[array([0]), array([], dtype=int64), array([0, 1])]
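
One caveat: the loop above treats every stored entry as nonzero. The matrix in the question reports 5 stored elements although only 3 values are nonzero, which suggests it holds explicitly stored zeros. A minimal sketch, assuming it is acceptable to work on a copy, drops those entries with eliminate_zeros() before reading .indptr and .indices. The function name zero_rows_per_column and the hand-built demo matrix are made up for this illustration:

```python
import numpy as np
from scipy import sparse


def zero_rows_per_column(mat):
    """Return, for each column, the row indices whose value is zero."""
    csc = mat.tocsc().copy()   # copy so the caller's matrix is left untouched
    csc.eliminate_zeros()      # drop explicitly stored zeros (in place)
    all_rows = np.arange(csc.shape[0])
    return [
        np.setdiff1d(all_rows, csc.indices[csc.indptr[j]:csc.indptr[j + 1]])
        for j in range(csc.shape[1])
    ]


# Demo matrix (an assumption): same values as in the question, but built
# from raw CSC arrays so that two of the five stored entries are zeros.
data = np.array([0, 3, 1, 4, 0])
indices = np.array([0, 1, 0, 1, 0])
indptr = np.array([0, 2, 4, 5])
demo = sparse.csc_matrix((data, indices, indptr), shape=(2, 3))

zero_rows = [rows.tolist() for rows in zero_rows_per_column(demo)]
print(zero_rows)
```

Without the eliminate_zeros() call, the explicitly stored zeros in columns 0 and 2 would be counted as nonzero and left out of the result.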

With your sample, this works:

In [808]: arr=sparse.csc_matrix([[0,1,0],[3,4,0]])
In [809]: arr1=arr==0
In [810]: arr1.T.tolil().rows
Out[810]: array([[0], [], [0, 1]], dtype=object)

Beware that when you do arr==0 you will get a warning:

/usr/lib/python3/dist-packages/scipy/sparse/compressed.py:220: SparseEfficiencyWarning: Comparing a sparse matrix with 0 using == is inefficient, try using != instead.
  ", try using != instead.", SparseEfficiencyWarning)

In your sample there are equal numbers of zeros and nonzeros. But in a typical sparse matrix there are many more zeros, and lots of zeros means long lists of rows.
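
If those long lists are a concern, one option is to produce them one column at a time instead of building arr==0 for the whole matrix. The following is only a sketch under that assumption; iter_zero_rows is a hypothetical helper, not a SciPy function. It reads the nonzero pattern straight from .indptr, .indices and .data, so no sparse comparison is performed:

```python
import numpy as np
from scipy import sparse


def iter_zero_rows(csc):
    """Yield (column, zero_row_indices) pairs for a CSC matrix, lazily."""
    n_rows, n_cols = csc.shape
    for j in range(n_cols):
        col = slice(csc.indptr[j], csc.indptr[j + 1])
        mask = np.ones(n_rows, dtype=bool)
        # Rows holding a genuinely nonzero value in this column;
        # explicitly stored zeros are filtered out by the data check.
        nonzero_rows = csc.indices[col][csc.data[col] != 0]
        mask[nonzero_rows] = False
        yield j, np.flatnonzero(mask)


arr = sparse.csc_matrix([[0, 1, 0], [3, 4, 0]])
zero_rows = {j: rows.tolist() for j, rows in iter_zero_rows(arr)}
print(zero_rows)  # {0: [0], 1: [], 2: [0, 1]}
```

Because it is a generator, only one column's index array exists at a time unless the caller chooses to collect them all.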
