Scipy CSR sparse matrix is actually COO?

I've recently been working with sparse matrices. My aim is to convert an adjacency list for a graph into the CSR format, as defined here: http://devblogs.nvidia.com/parallelforall/wp-content/uploads/2014/07/CSR.png .

One option I see is to first construct a NumPy matrix and then convert it using scipy.sparse.csr_matrix. The problem is that the CSR in SciPy appears to be somewhat different from the one discussed in the link. My question is: is this a real discrepancy, so that I need to write my own parser, or can SciPy in fact convert into the CSR format defined in the link?

A bit more about the problem. Let's say I have a matrix:

matrix([[1, 1, 0],
        [0, 0, 1],
        [1, 0, 1]])

The CSR format for this consists of two arrays, column (C) and row (R). The result I am aiming for looks like:

C: [0,1,2,0,2]

R: [0,2,3,5]

SciPy returns:

  (0, 0)    1
  (0, 1)    1
  (1, 2)    1
  (2, 0)    1
  (2, 2)    1

where the second column is the same as my C, yet to my understanding this is the COO format, not CSR. (This was done using the csr_matrix(adjacency_matrix) function.)

There is a difference between what is stored internally and what you see when you simply print the matrix via print(A) (where A is a csr_matrix). The printed output lists each stored entry as (row, column) value, which looks like COO, but the underlying storage is still CSR.

The attributes are listed in the documentation. Among others, there are the following three:

data: CSR format data array of the matrix
indices: CSR format index array of the matrix
indptr: CSR format index pointer array of the matrix

You can access (and manipulate) them through A.data, A.indices and A.indptr. Here A.indices corresponds to your C array and A.indptr to your R array.
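As a quick check, the following sketch builds the matrix from the question and inspects the three attributes. The variable name adjacency_matrix is taken from the question; everything else is just illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# The example matrix from the question
adjacency_matrix = np.array([[1, 1, 0],
                             [0, 0, 1],
                             [1, 0, 1]])

A = csr_matrix(adjacency_matrix)

print(A.format)            # storage format of the matrix: 'csr'
print(A.indices.tolist())  # column indices (C): [0, 1, 2, 0, 2]
print(A.indptr.tolist())   # row pointers (R):   [0, 2, 3, 5]
print(A.data.tolist())     # stored values:      [1, 1, 1, 1, 1]
```

Row i of the matrix occupies the slice indptr[i]:indptr[i+1] of both indices and data, which is the layout shown in the linked picture.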

Bottom line: the CSR format in SciPy is a "real" CSR format, and you do not need to write your own parser (as long as you don't mind the data array, which is unnecessary in your case because every stored value is 1).
Also note: a matrix in CSR format is always represented by three arrays, not two.
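If the graph is already available as an adjacency list, the dense NumPy matrix can be skipped, because csr_matrix also accepts the three arrays directly in the form (data, indices, indptr). The sketch below assumes a hypothetical adjacency_list dict that maps each node to its neighbours; that structure is not from the question, so adapt it to whatever you actually have.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical adjacency list matching the 3x3 example matrix:
# node -> list of neighbouring nodes (columns holding a 1 in that row)
adjacency_list = {0: [0, 1], 1: [2], 2: [0, 2]}
n = len(adjacency_list)

indices = []   # column index of every stored entry (C)
indptr = [0]   # position in `indices` where each row starts (R)
for node in range(n):
    indices.extend(sorted(adjacency_list[node]))
    indptr.append(len(indices))

# Every edge gets the value 1, so the data array is all ones
data = np.ones(len(indices), dtype=np.int64)

A = csr_matrix((data, np.array(indices), np.array(indptr)), shape=(n, n))
print(A.toarray())
```

Sorting each neighbour list keeps the column indices in ascending order within every row, which matches what csr_matrix produces from a dense matrix.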
