Sparse Matrix size vs Regular Matrix size in R

My regular matrix has an object size of 416 bytes, and when I use as(, "sparseMatrix") to turn it into a sparse matrix, the size of this sparse matrix goes up to 1720 bytes.

Is this normal? Shouldn't we expect a smaller storage size for the sparse matrix than for the regular one?

Many thanks in advance!

matrix is one of the base data structures of R and can be stored with very little metadata: it is a sequence of values, with just a length for each dimension and a data type.
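To make the "very little metadata" point concrete, here is a minimal sketch in base R. The variable name m is only illustrative; the calls are standard base/utils functions.

```r
# A base R matrix is a plain vector plus a "dim" attribute.
m <- matrix(1:9, nrow = 3)

print(attributes(m))   # the only attribute is $dim
print(typeof(m))       # storage type of the underlying vector
print(object.size(m))  # total size in bytes; the exact figure depends on the R build
```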

A sparseMatrix object, however, contains more metadata, as you'll see with str() in the examples below. Most prominently, for each non-zero value a position is stored in addition to the value itself: in the dgCMatrix class shown below, that is a 4-byte row index in the i slot, with the column boundaries kept in the p slot, and the values themselves are held as 8-byte doubles in the x slot. This alone will cause a roughly threefold increase in memory use if you're storing integers (12 bytes per entry instead of 4). This is only compensated when there are many zero values, as they are not stored at all.
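The per-entry cost can be inspected slot by slot. The following is a small sketch, assuming the Matrix package is installed; the payload formula is a rough estimate that ignores the fixed overhead of the object itself.

```r
library(Matrix)  # provides the sparse matrix classes discussed here

# Dense integer matrix with no zeros: 4 bytes per entry.
m <- matrix(1:9, nrow = 3)
s <- as(m, "sparseMatrix")

# Size in bytes of the three slots that grow with the data.
slot_bytes <- sapply(c("i", "p", "x"),
                     function(nm) as.numeric(object.size(slot(s, nm))))
print(slot_bytes)

# Rough payload estimate: 4 bytes (row index) + 8 bytes (double value) per
# non-zero entry, plus 4 bytes per column pointer. Object overhead excluded.
nnz <- length(s@x)
payload <- 12 * nnz + 4 * (ncol(s) + 1)
cat("non-zeros:", nnz, " estimated payload bytes:", payload, "\n")
```

For tiny matrices like this one, the fixed overhead of the object is much larger than the payload, which is why the totals reported in the examples are well above this estimate.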

Dense example

Compare for a matrix with no zero values:

> mat1 = matrix( sample(3*3), c(3, 3))
> smat1 <- as(mat1, "sparseMatrix")

> showMem(c('mat1', 'smat1'), bytes=T)
        size bytes
mat1   264 B   264
smat1 1.7 kB  1688

> mat1
     [,1] [,2] [,3]
[1,]    2    5    7
[2,]    8    6    1
[3,]    3    4    9

> str(mat1)
 int [1:3, 1:3] 2 8 3 5 6 4 7 1 9

> str(smat1)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:9] 0 1 2 0 1 2 0 1 2
  ..@ p       : int [1:4] 0 3 6 9
  ..@ Dim     : int [1:2] 3 3
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:9] 2 8 3 5 6 4 7 1 9
  ..@ factors : list()

Or a larger version of such a matrix:

> mat2 = matrix( sample(1000*1000), c(1000, 1000))
> smat2 <- as(mat2, "sparseMatrix")

> showMem(c('mat2', 'smat2'), bytes=T)
       size    bytes
mat2   4 MB  4000216
smat2 12 MB 12005504
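As a rough cross-check of these figures, the payload can be estimated from the slot layout seen in str(): one 4-byte row index and one 8-byte double per non-zero entry, plus one 4-byte pointer per column boundary. This is a back-of-the-envelope sketch; the small remainder between the estimate and the reported sizes is assumed to be fixed object overhead.

```r
# Back-of-the-envelope estimate for the 1000 x 1000 case with no zeros.
n_row <- 1000
n_col <- 1000
nnz   <- n_row * n_col                              # every entry is non-zero

dense_int_bytes <- 4 * nnz                          # 4-byte integers
sparse_payload  <- (4 + 8) * nnz + 4 * (n_col + 1)  # i and x per entry, plus p

cat("dense payload: ", dense_int_bytes, "\n")
cat("sparse payload:", sparse_payload, "\n")
cat("ratio:         ", sparse_payload / dense_int_bytes, "\n")
```

The estimates land within about 1.5 kB of the reported 4000216 and 12005504 bytes, consistent with the threefold increase described earlier.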

Sparse example

Here we create a sparser matrix, with 6 zeroes and only 3 non-zero values. We can see that the sparseMatrix only stores the 3 values. At this tiny size the sparse object is still larger than the dense one, because its fixed metadata outweighs the few bytes saved.

> mat3 = matrix( sample(3*3)%%3%%2, c(3, 3))
> smat3 <- as(mat3, "sparseMatrix")

> showMem(c('mat3', 'smat3'), bytes=T)
        size bytes
mat3   344 B   344
smat3 1.6 kB  1560

> mat3
     [,1] [,2] [,3]
[1,]    0    1    0
[2,]    0    0    0
[3,]    1    0    1

> str(mat3)
 num [1:3, 1:3] 0 0 1 1 0 0 0 0 1

> str(smat3)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:3] 2 0 2
  ..@ p       : int [1:4] 0 1 2 3
  ..@ Dim     : int [1:2] 3 3
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:3] 1 1 1
  ..@ factors : list()

And finally a case where the sparseMatrix gives the expected memory savings:

> mat4 = matrix( sample(1000*1000)%%3%%2, c(1000, 1000))

> smat4 <- as(mat4, "sparseMatrix")

> table(mat4)
mat4
     0      1 
666666 333334 

> showMem(c('mat4', 'smat4'), bytes=T)
      size   bytes
mat4  8 MB 8000216
smat4 4 MB 4005512
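The same arithmetic gives a rough break-even point. The helpers below are hypothetical (they are not part of the Matrix package) and only model the payload of a column-compressed matrix with double values, ignoring fixed object overhead.

```r
# Hypothetical helper: estimated payload in bytes of a column-compressed
# sparse matrix with double values (12 bytes per non-zero, 4 per column pointer).
sparse_payload_bytes <- function(nnz, n_col) 12 * nnz + 4 * (n_col + 1)

# Hypothetical helper: estimated payload of a dense matrix;
# 8 bytes per double, 4 bytes per integer.
dense_payload_bytes <- function(n_row, n_col, bytes_per_value = 8) {
  bytes_per_value * n_row * n_col
}

n_row <- 1000
n_col <- 1000
nnz   <- 333334  # non-zero count reported by table(mat4)

sparse_est <- sparse_payload_bytes(nnz, n_col)
dense_est  <- dense_payload_bytes(n_row, n_col)
cat("sparse estimate:", sparse_est, " dense estimate:", dense_est, "\n")

# Largest fraction of non-zero entries at which the sparse payload is smaller.
break_even <- function(n_row, n_col, bytes_per_value = 8) {
  (bytes_per_value * n_row * n_col - 4 * (n_col + 1)) / (12 * n_row * n_col)
}
cat("break-even density (double): ", break_even(n_row, n_col, 8), "\n")
cat("break-even density (integer):", break_even(n_row, n_col, 4), "\n")
```

Under these assumptions, a double matrix needs fewer than about two thirds of its entries to be non-zero before the sparse form is smaller, and an integer matrix fewer than about one third.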
