
Create a sparse matrix from a list of rows (sparse vectors)

I would like to efficiently create the following sparse matrix of dimension (s, n1+n2):

v0 v1 
v0 v2 
v0 v3 
 ... 
v0 vs

given a sparse vector v0 of shape (1, n1) and a list of s sparse vectors of shape (1, n2), l = [v1, ... , vs].

I tried coo_matrix(), without success; it seems to work only with dense vectors:

left = coo_matrix(np.repeat(v0, s))
right = coo_matrix(l)
m = hstack((left, right))

Edit 1:

I have found a workaround that does not seem very efficient:

right = vstack([x for x in l])
left = vstack([v0 for i in range(len(l))])
m = hstack((left, right))

Edit 2:

This is an example (not working) to help you understand the situation.

from scipy.sparse import random, coo_matrix
from numpy import repeat

s = 10
n1 = 3
n2 = 5

v0 = random(1, n1)
l = [random(1, n2) for i in range(s)]

left = coo_matrix(repeat(v0, s))
right = coo_matrix(l)
m = hstack((left, right))

In [1]: from scipy import sparse

In [2]: s, n1, n2 = 10,3,5
In [3]: v0 = sparse.random(1, n1)
In [4]: v0
Out[4]: 
<1x3 sparse matrix of type '<class 'numpy.float64'>'
    with 0 stored elements in COOrdinate format>
In [5]: l = [sparse.random(1, n2) for i in range(s)]
In [6]: l
Out[6]: 
[<1x5 sparse matrix of type '<class 'numpy.float64'>'
    with 0 stored elements in COOrdinate format>,
  ...
 <1x5 sparse matrix of type '<class 'numpy.float64'>'
    with 0 stored elements in COOrdinate format>]

Instead of np.repeat, use sparse.vstack to stack s copies of v0:

In [7]: V0 = sparse.vstack([v0]*s)
In [8]: V0
Out[8]: 
<10x3 sparse matrix of type '<class 'numpy.float64'>'
    with 0 stored elements in COOrdinate format>

Similarly, convert the list of (1, n2) matrices into one matrix:

In [10]: V1 = sparse.vstack(l)
In [11]: V1
Out[11]: 
<10x5 sparse matrix of type '<class 'numpy.float64'>'
    with 0 stored elements in COOrdinate format>

Now join them:

In [12]: m = sparse.hstack((V0,V1))
In [13]: m
Out[13]: 
<10x8 sparse matrix of type '<class 'numpy.float64'>'
    with 0 stored elements in COOrdinate format>
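
The session above reports 0 stored elements because sparse.random defaults to density=0.01, which rounds down to zero nonzeros for shapes this small. A minimal sketch of the same three steps with actual data, checked against a dense construction, might look like this. The density=0.5 and random_state values are illustrative assumptions, not part of the original question:

```python
import numpy as np
from scipy import sparse

s, n1, n2 = 10, 3, 5

# density and random_state are illustrative choices so the vectors
# actually contain nonzeros and the run is reproducible
v0 = sparse.random(1, n1, density=0.5, format="csr", random_state=0)
l = [sparse.random(1, n2, density=0.5, format="csr", random_state=i + 1)
     for i in range(s)]

left = sparse.vstack([v0] * s)   # s copies of v0, shape (s, n1)
right = sparse.vstack(l)         # one row per vector in l, shape (s, n2)
m = sparse.hstack((left, right), format="csr")  # shape (s, n1 + n2)

# Dense reference built with plain NumPy, only for checking the result
expected = np.hstack([
    np.repeat(v0.toarray(), s, axis=0),
    np.vstack([v.toarray() for v in l]),
])
print(m.shape, m.nnz)
print(np.allclose(m.toarray(), expected))
```

The dense comparison is only practical for small sizes like these; for real data you would skip it.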

I won't make any claims about this being efficient. hstack and vstack use bmat (check their code). bmat collects the coo attributes of all the blocks and joins them (with offsets) into the inputs to a new coo_matrix call (again, the code is readable). So you could avoid some intermediate conversions by using bmat directly, or even by working with the coo attributes directly. But hstack and vstack are relatively intuitive.
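
As a rough sketch of those two alternatives (not benchmarked, so no claim that either is faster): the first passes an s x 2 grid of blocks to a single bmat call, and the second concatenates the row, col and data attributes by hand, shifting the right-hand columns by n1. The names m_bmat and m_coo and the density and random_state values are assumptions made for illustration:

```python
import numpy as np
from scipy import sparse

s, n1, n2 = 10, 3, 5
# illustrative inputs; density and random_state are assumptions
v0 = sparse.random(1, n1, density=0.5, random_state=0)
l = [sparse.random(1, n2, density=0.5, random_state=i + 1) for i in range(s)]

# Option 1: a single bmat call over an s x 2 grid of blocks [v0, vi]
m_bmat = sparse.bmat([[v0, vi] for vi in l], format="coo")

# Option 2: assemble the COO triplets directly
v0 = v0.tocoo()
rows, cols, data = [], [], []
for i, vi in enumerate(l):
    vi = vi.tocoo()
    # left block: v0's entries placed in row i, columns unchanged
    rows.append(np.full(v0.nnz, i))
    cols.append(v0.col)
    data.append(v0.data)
    # right block: vi's entries placed in row i, columns shifted by n1
    rows.append(np.full(vi.nnz, i))
    cols.append(vi.col + n1)
    data.append(vi.data)

m_coo = sparse.coo_matrix(
    (np.concatenate(data), (np.concatenate(rows), np.concatenate(cols))),
    shape=(s, n1 + n2),
)

print(np.allclose(m_bmat.toarray(), m_coo.toarray()))
```

Whether either option beats the vstack/hstack version would need timing on realistic sizes.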
