List non-zero elements from a sparse matrix in Python
How can I list all non-zero elements of a csr_matrix with simple, fast, one-line code?
I'm using this code:
edges_list = list([tuple(row) for row in np.transpose(A.nonzero())])
weight_list = [A[e] for e in edges_list]
but it takes quite a long time to execute.
For a CSR matrix in canonical form, access the data array directly:
A.data
but be aware that a matrix not in canonical form may store explicit zeros or duplicate entries in its representation, and these need special handling. For example,
# Merge duplicates and remove explicit zeros. Both operations modify A.
# We sum duplicates first because they might sum to zero - for example,
# if a 5 and a -5 are in the same spot, we have to sum them to 0 and then remove the 0.
A.sum_duplicates()
A.eliminate_zeros()
# Now use A.data
do_whatever_with(A.data)
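If modifying A in place is not acceptable, the same two steps can be applied to a copy. The following is a minimal sketch under that assumption; the helper name nonzero_values is hypothetical, and the matrix is built from raw (data, indices, indptr) arrays only to produce a deliberately non-canonical example with a duplicate pair and an explicit zero.

```python
import numpy as np
from scipy.sparse import csr_matrix


def nonzero_values(A):
    """Return the non-zero values of a CSR matrix without modifying A.

    Hypothetical helper: it canonicalizes a copy, so it costs extra memory.
    """
    B = A.copy()
    B.sum_duplicates()    # merge entries that share the same (row, col)
    B.eliminate_zeros()   # drop stored zeros, including sums that cancelled
    return B.data


# Row 0 stores 5 and -5 at column 1 (duplicates) and an explicit 0 at column 2.
# Row 1 stores a single 3 at column 0.
data = np.array([5.0, -5.0, 0.0, 3.0])
indices = np.array([1, 1, 2, 0])
indptr = np.array([0, 3, 4])
A = csr_matrix((data, indices, indptr), shape=(2, 3))

values = nonzero_values(A)
print(values)
```

Here the 5 and -5 cancel, so only the 3 survives, while A itself keeps its original four stored entries.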
You can use A.nonzero() to index into A directly (shown here with a dense NumPy array):
In [19]: A = np.random.randint(0, 3, (3, 3))
In [20]: A
Out[20]:
array([[2, 1, 1],
       [1, 2, 2],
       [0, 1, 0]])
In [21]: A[A.nonzero()]
Out[21]: array([2, 1, 1, 1, 2, 2, 1])
The result is the same as with your approach:
In [22]: edges_list = list([tuple(row) for row in np.transpose(A.nonzero())])
In [23]: [A[e] for e in edges_list]
Out[23]: [2, 1, 1, 1, 2, 2, 1]
It is also considerably faster, and the gap widens as the matrix gets bigger:
In [25]: %timeit [A[e] for e in list([tuple(row) for row in np.transpose(A.nonzero())])]
10000 loops, best of 3: 48 µs per loop
In [26]: %timeit A[A.nonzero()]
100000 loops, best of 3: 10.7 µs per loop
This also works with a scipy csr_matrix, although there are better methods for those, as shown in other answers:
In [30]: M = scipy.sparse.csr_matrix(A)
In [31]: M[M.nonzero()]
Out[31]: matrix([[2, 1, 1, 1, 2, 2, 1]], dtype=int32)
Just use A.data:
In [16]: from scipy.sparse import csr_matrix
In [17]: A = csr_matrix([[1,0,0],[0,2,0]])
In [18]: A.data
Out[18]: array([1, 2])
If the sparse matrix has been modified, or just to be safe, call A.eliminate_zeros() first:
In [19]: A[0,0] = 0
In [20]: A.data
Out[20]: array([0, 2])
In [21]: A.eliminate_zeros()
In [22]: A.data
Out[22]: array([2])
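One way to tell whether the cleanup is needed is to compare the number of stored entries with the number of entries that are actually non-zero. This is only a sketch: it assumes a SciPy version that provides count_nonzero() on sparse matrices, and it checks for explicit zeros only, not for duplicate entries.

```python
from scipy.sparse import csr_matrix

A = csr_matrix([[1, 0, 0], [0, 2, 0]])
A[0, 0] = 0  # overwrites an existing entry, which may leave a stored zero

stored = A.nnz                 # number of stored entries (may include zeros)
nonzero = A.count_nonzero()    # number of entries whose value is not zero

if stored != nonzero:
    A.eliminate_zeros()        # modifies A in place

print(stored, nonzero, A.data)
```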
You could use scipy.sparse.find like this:
>>> from scipy.sparse import csr_matrix, find
>>> A = csr_matrix([[7.0, 8.0, 0],[0, 0, 9.0]])
>>> find(A)
(array([0, 0, 1], dtype=int32), array([0, 1, 2], dtype=int32), array([ 7., 8., 9.]))
https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.find.html
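To get the edges_list and weight_list from the question out of find, the three returned arrays can be zipped together. A small sketch, reusing the question's variable names; it makes no assumption about the order in which find returns the entries, only that the three arrays are aligned with each other.

```python
from scipy.sparse import csr_matrix, find

A = csr_matrix([[7.0, 8.0, 0.0], [0.0, 0.0, 9.0]])

# find returns row indices, column indices and values of the non-zero entries
rows, cols, values = find(A)

# Pair up the coordinates; .tolist() converts NumPy scalars to plain Python types
edges_list = list(zip(rows.tolist(), cols.tolist()))
weight_list = values.tolist()

print(edges_list)
print(weight_list)
```

This avoids indexing A once per element, which is the slow part of the original list comprehension.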