
How to store a sparse matrix?

Original post 2015-06-16 09:06:42 5 3 c++/ time-complexity/ sparse-matrix

I need to implement two ways of storing a sparse matrix in C++:

Linked list
Array (the efficient way)

Space complexity is very important here. What are the most efficient ways to do it?

3 answers

nnz: number of non-zero elements in the sparse matrix
row_size: number of rows in the matrix
column_size: number of columns in the matrix
There are many storage formats. Their space requirements are:

Compressed Sparse Row (CSR): 2*nnz + row_size + 1 memory cells (nnz values, nnz column indices, row_size + 1 row offsets)
Compressed Sparse Column (CSC): 2*nnz + column_size + 1 memory cells (nnz values, nnz row indices, column_size + 1 column offsets)
Coordinate Format (COO): 3*nnz memory cells (a value, a row index, and a column index per non-zero element)

For space complexity:
If row_size > column_size, use the CSC format; otherwise, use the CSR format.
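The space rule above can be written down as a few one-line functions. This is only a sketch: the names `csr_cells`, `csc_cells`, `coo_cells`, and `pick_compressed` are made up for illustration, and the counts assume the common layout in which the CSR/CSC offset array has one extra entry marking the end of the data (hence the `+ 1`).

```cpp
#include <cstddef>

enum class Format { CSR, CSC, COO };

// Cells needed by each format: values + indices + offsets.
// Assumption: the offset array holds one extra end-of-data entry.
constexpr std::size_t csr_cells(std::size_t nnz, std::size_t row_size) {
    return 2 * nnz + row_size + 1;
}
constexpr std::size_t csc_cells(std::size_t nnz, std::size_t column_size) {
    return 2 * nnz + column_size + 1;
}
// COO stores a (value, row, column) triple per non-zero element.
constexpr std::size_t coo_cells(std::size_t nnz) {
    return 3 * nnz;
}

// Choose the compressed format with the shorter offset array.
constexpr Format pick_compressed(std::size_t row_size, std::size_t column_size) {
    return row_size > column_size ? Format::CSC : Format::CSR;
}
```

For example, a 1000 x 10 matrix with 50 non-zero elements needs `csr_cells(50, 1000)` = 1101 cells in CSR but only `csc_cells(50, 10)` = 111 cells in CSC, which is why the tall matrix favors CSC.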

For time complexity:
In CSR format, a row is located in O(1) time, and a column within that row is found in O(log(k)) time by binary search over the row's column indices (which must be sorted), where k is the number of non-zero elements in that row. So a value is looked up in O(log(k)) time.
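In COO format, a value is looked up in O(nnz) time by a linear scan over unsorted triples, or in O(log(nnz)) time by binary search if the triples are kept sorted by (row, column). O(1) lookup is only possible with an additional hash index over the coordinates.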
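The CSR lookup described above might look like the following sketch. The type and function names (`CsrMatrix`, `to_csr`, `csr_get`) are hypothetical, values are assumed to be `double`, and the column indices of each row are assumed to be stored in ascending order so that binary search is valid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSR container (illustrative names, double values assumed).
struct CsrMatrix {
    std::size_t rows = 0, cols = 0;
    std::vector<double> values;        // nnz non-zero values, row by row
    std::vector<std::size_t> col_idx;  // nnz column indices, ascending within a row
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into values/col_idx
};

// Build CSR from a dense row-major matrix, skipping zeros.
CsrMatrix to_csr(const std::vector<std::vector<double>>& dense) {
    CsrMatrix m;
    m.rows = dense.size();
    m.cols = dense.empty() ? 0 : dense[0].size();
    m.row_ptr.push_back(0);
    for (const auto& row : dense) {
        assert(row.size() == m.cols);  // all rows must have the same length
        for (std::size_t j = 0; j < row.size(); ++j) {
            if (row[j] != 0.0) {
                m.values.push_back(row[j]);
                m.col_idx.push_back(j);
            }
        }
        m.row_ptr.push_back(m.values.size());  // end of this row's data
    }
    return m;
}

// Element (i, j): O(1) to locate the row, O(log k) binary search inside it.
double csr_get(const CsrMatrix& m, std::size_t i, std::size_t j) {
    assert(i < m.rows && j < m.cols);
    auto first = m.col_idx.begin() + static_cast<std::ptrdiff_t>(m.row_ptr[i]);
    auto last = m.col_idx.begin() + static_cast<std::ptrdiff_t>(m.row_ptr[i + 1]);
    auto it = std::lower_bound(first, last, j);
    if (it == last || *it != j) return 0.0;  // not stored, so the element is zero
    return m.values[static_cast<std::size_t>(it - m.col_idx.begin())];
}
```

Building from a dense matrix is only done here to keep the example short; in practice the three arrays would be filled directly from the non-zero elements.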

Format details:
[1] https://en.wikipedia.org/wiki/Sparse_matrix
[2] https://software.intel.com/en-us/node/471374

An efficient way would be to use a hash map of hash maps: the outer map is keyed by row index, and each inner map stores that row's elements keyed by column index. You would then be able to access any element in O(1) time on average.
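A minimal sketch of this hash-map-of-hash-maps idea, assuming `double` values and using `std::unordered_map`; the class name `HashSparseMatrix` and its members are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Sparse matrix as a hash map of rows; each row maps column -> value.
class HashSparseMatrix {
public:
    HashSparseMatrix(std::size_t n_rows, std::size_t n_cols)
        : n_rows_(n_rows), n_cols_(n_cols) {}

    std::size_t rows() const { return n_rows_; }
    std::size_t cols() const { return n_cols_; }

    // Store a value; writing zero erases the entry so only non-zeros use memory.
    void set(std::size_t row, std::size_t col, double value) {
        assert(row < n_rows_ && col < n_cols_);
        if (value == 0.0) {
            auto r = data_.find(row);
            if (r == data_.end()) return;
            r->second.erase(col);
            if (r->second.empty()) data_.erase(r);  // drop empty rows too
            return;
        }
        data_[row][col] = value;
    }

    // Average O(1): one hash lookup for the row, one for the column.
    double get(std::size_t row, std::size_t col) const {
        assert(row < n_rows_ && col < n_cols_);
        auto r = data_.find(row);
        if (r == data_.end()) return 0.0;
        auto c = r->second.find(col);
        return c == r->second.end() ? 0.0 : c->second;
    }

    // Number of stored non-zero elements.
    std::size_t nnz() const {
        std::size_t n = 0;
        for (const auto& r : data_) n += r.second.size();
        return n;
    }

private:
    std::size_t n_rows_, n_cols_;
    std::unordered_map<std::size_t, std::unordered_map<std::size_t, double>> data_;
};
```

One trade-off to keep in mind: hash tables carry per-entry bookkeeping overhead, so this layout typically uses more memory per stored element than the flat arrays of CSR or COO, in exchange for fast random access and cheap insertion.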

You can implement the usual numeric algorithms, such as addition and multiplication, by iterating only over the non-zero elements, which gives better complexity than O(N * M), where N and M are the numbers of columns and rows in the matrix.
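As a rough illustration of iterating only over non-zero elements, the sketch below adds two sparse matrices and multiplies a sparse matrix by a dense vector. The aliases `Sparse` and `SparseRow` and the functions `add` and `multiply` are hypothetical names chosen for this example, built on nested `std::unordered_map`s.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

using SparseRow = std::unordered_map<std::size_t, double>;  // column -> value
using Sparse = std::unordered_map<std::size_t, SparseRow>;  // row -> SparseRow

// C = A + B, touching only stored entries: O(nnz(A) + nnz(B)) on average.
Sparse add(const Sparse& a, const Sparse& b) {
    Sparse c = a;
    for (const auto& row : b) {
        SparseRow& target = c[row.first];
        for (const auto& cell : row.second) {
            double sum = (target[cell.first] += cell.second);
            if (sum == 0.0) target.erase(cell.first);  // keep only non-zeros
        }
        if (target.empty()) c.erase(row.first);
    }
    return c;
}

// y = A * x for a dense vector x: O(nnz(A)) work instead of O(N * M).
std::vector<double> multiply(const Sparse& a, const std::vector<double>& x,
                             std::size_t n_rows) {
    std::vector<double> y(n_rows, 0.0);
    for (const auto& row : a) {
        assert(row.first < n_rows);
        for (const auto& cell : row.second) {
            assert(cell.first < x.size());
            y[row.first] += cell.second * x[cell.first];
        }
    }
    return y;
}
```

The same pattern, looping over stored cells rather than over all N * M positions, carries over to CSR or COO storage; only the iteration code changes.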

Since the matrix is sparse, you only need to store the cells that are filled in. A simple lookup from coordinate to value should do. Ideally you should use a container with fast lookup, such as std::map (O(log n) lookup) or std::unordered_map (O(1) average lookup).
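A possible shape for such a coordinate-to-value lookup is sketched below. `std::map` accepts a `std::pair` key directly, while `std::unordered_map` needs a hash function for the pair, which the standard library does not supply; the `CoordHash` combiner here is an illustrative assumption, as are the names `Coord`, `TreeSparse`, `HashSparse`, and `CoordMatrix`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <unordered_map>
#include <utility>

using Coord = std::pair<std::size_t, std::size_t>;  // (row, column)

// Ordered map: std::pair already provides operator<. Lookup is O(log n).
using TreeSparse = std::map<Coord, double>;

// Hash for a coordinate pair (simple illustrative combiner, not a standard one).
struct CoordHash {
    std::size_t operator()(const Coord& c) const noexcept {
        std::size_t h1 = std::hash<std::size_t>{}(c.first);
        std::size_t h2 = std::hash<std::size_t>{}(c.second);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

// Hash map: average O(1) lookup.
using HashSparse = std::unordered_map<Coord, double, CoordHash>;

// Thin wrapper that works with either container and remembers the dimensions,
// which cannot be recovered from the stored cells alone.
template <typename Storage>
struct CoordMatrix {
    std::size_t rows;
    std::size_t cols;
    Storage cells;

    void set(std::size_t r, std::size_t c, double v) {
        assert(r < rows && c < cols);
        if (v == 0.0) cells.erase(Coord{r, c});  // zeros are not stored
        else cells[Coord{r, c}] = v;
    }

    double get(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        auto it = cells.find(Coord{r, c});
        return it == cells.end() ? 0.0 : it->second;
    }
};
```

`CoordMatrix<TreeSparse>` keeps elements ordered by (row, column), which makes row-wise traversal easy; `CoordMatrix<HashSparse>` trades that ordering for faster average lookup.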