
Storing a Sparse NumPy Array

I have a 20,000 x 20,000 NumPy matrix that I wish to store in a file, where the average column has only 12 values in it.

What would be the most efficient way to store only the values, in the format of

if array[i][j] == 1:
    file.write("{} {} {{}}\n".format(i, j))

where (i, j) are the indices for the array?

You can use scipy to create sparse matrices from dense NumPy arrays. A sparse matrix stores only the nonzero entries together with their indices.

import numpy as np
import scipy.sparse
import pickle

I = np.eye(10000)  # 10000 nonzero values along the diagonal
S = scipy.sparse.csr_matrix(I)
S
<10000x10000 sparse matrix of type '<class 'numpy.float64'>'
    with 10000 stored elements in Compressed Sparse Row format>
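As a rough check of how much smaller the sparse form is, the sizes of the underlying buffers can be compared directly. This is only a sketch: it uses a 1000 x 1000 identity matrix (an assumption, chosen to keep the dense array small) and counts the three arrays that a CSR matrix holds.

```python
import numpy as np
from scipy import sparse

# Smaller than the example above so the dense array stays cheap to build.
dense = np.eye(1000)
S = sparse.csr_matrix(dense)

# CSR keeps three arrays: the stored values, their column indices,
# and one pointer per row marking where that row's entries start.
sparse_bytes = S.data.nbytes + S.indices.nbytes + S.indptr.nbytes

print("dense bytes: ", dense.nbytes)
print("sparse bytes:", sparse_bytes)
```

The exact sparse size depends on the index dtype scipy picks, so treat the printed numbers as indicative rather than fixed.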

This is highly memory efficient, and you can use pickle to dump and load the sparse matrix when you need it.

# Pickle dump
with open("S.pickle", "wb") as file:  # 160 KB
    pickle.dump(S, file)

# Pickle load
with open("S.pickle", "rb") as file:
    S = pickle.load(file)
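As an alternative to pickle (not part of the original answer), scipy also provides scipy.sparse.save_npz and scipy.sparse.load_npz, which write the matrix to a compressed .npz file. A minimal sketch, assuming a small identity matrix and a hypothetical file name in the system temp directory:

```python
import os
import tempfile

import numpy as np
from scipy import sparse

S = sparse.csr_matrix(np.eye(1000))

# Hypothetical output path; any writable location ending in .npz works.
path = os.path.join(tempfile.gettempdir(), "S_example.npz")

sparse.save_npz(path, S)           # compressed by default
S_loaded = sparse.load_npz(path)   # comes back in the same sparse format

# Compare via dense arrays; fine here because the example is small.
round_trip_ok = np.array_equal(S.toarray(), S_loaded.toarray())
print(round_trip_ok)
```

Unlike pickle, this format stores only arrays, which may be preferable when the file is shared with others.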

To get back a dense representation, use .toarray() to get a NumPy array or .todense() to get a matrix-type object.

S.toarray()
array([[1., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 0., 0., 1.]])

For those reading after the fact: @hpaulj's comment suggesting "np.nonzero" effectively solves the problem!

Edit: Here is the code I used to solve it!

array1, array2 = np.nonzero(array)
for i in range(array1.size):
    file.write("{} {} {{}}\n".format(array1[i], array2[i]))
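For completeness, here is a hedged sketch of the same np.nonzero approach wrapped in a function, plus a reader that rebuilds a sparse matrix from the file. The names write_edges and read_edges, the file name, and the small test array are all hypothetical, and the reader assumes every stored value is 1, as in the question.

```python
import os
import tempfile

import numpy as np
from scipy import sparse


def write_edges(array, path):
    """Write one 'i j {}' line per nonzero entry of a 2-D array."""
    rows, cols = np.nonzero(array)
    with open(path, "w") as f:
        for i, j in zip(rows, cols):
            f.write("{} {} {{}}\n".format(i, j))


def read_edges(path, shape):
    """Rebuild a sparse matrix of ones from a file written by write_edges."""
    rows, cols = [], []
    with open(path) as f:
        for line in f:
            i, j, _ = line.split()  # third field is the literal '{}'
            rows.append(int(i))
            cols.append(int(j))
    data = np.ones(len(rows))
    return sparse.coo_matrix((data, (rows, cols)), shape=shape)


# Small hypothetical example array with two nonzero entries.
array = np.zeros((4, 5))
array[0, 1] = 1
array[3, 4] = 1

path = os.path.join(tempfile.mkdtemp(), "edges.txt")
write_edges(array, path)
restored = read_edges(path, array.shape)
```

The shape has to be passed to the reader separately because the text file records only the positions of the nonzero entries.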
