
Elementwise addition of sparse scipy matrix vector with broadcasting

I'm trying to figure out how best to perform elementwise addition (and subtraction) of a sparse matrix and a sparse vector. I found this trick on SO:

import numpy as np
import scipy.sparse as sp

mat = sp.csc_matrix([[1,0,0],[0,1,0],[0,0,1]])
vec = sp.csr_matrix([[1,2,1]])
mat.data += np.repeat(vec.toarray()[0], np.diff(mat.indptr))

But unfortunately it only updates the non-zero values:

print(mat.todense())
[[2 0 0]
 [0 3 0]
 [0 0 2]]
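
To see why the zeros stay untouched, it may help to look at what the trick actually operates on. The following is a minimal sketch using the same mat and vec; the variable names per_column, increments and nnz_after are only illustrative.

```python
import numpy as np
import scipy.sparse as sp

mat = sp.csc_matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
vec = sp.csr_matrix([[1, 2, 1]])

# CSC keeps one value per stored entry; implicit zeros have no slot in .data.
per_column = np.diff(mat.indptr)      # number of stored entries in each column

# One copy of vec[j] for each stored entry of column j, so len(increments) == mat.nnz.
increments = np.repeat(vec.toarray()[0], per_column)
mat.data += increments

nnz_after = mat.nnz                   # unchanged: no new entries were created
```

Since the update only walks mat.data, positions that were implicit zeros can never receive a value this way.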

The actual accepted answer on the SO thread:

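from numpy.lib.stride_tricks import as_strided

def sum(X,v):  # name kept from the quoted answer; it shadows the built-in sum
    rows, cols = X.shape
    # View indptr as overlapping (start, stop) pairs, one pair per compressed row
    row_start_stop = as_strided(X.indptr, shape=(rows, 2),
                            strides=2*X.indptr.strides)
    for row, (start, stop) in enumerate(row_start_stop):
        data = X.data[start:stop]  # a view into X.data, so the update is in place
        data -= v[row]  # subtracts, and only from stored entries

sum(mat,vec.A[0])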

It has the same limitation: only stored values are touched. I'm unfortunately out of ideas by now, so I was hoping you could help me figure out the best way to solve this.

EDIT: I'm expecting it to do the same as a dense version of this would do:

np.eye(3) + np.asarray([[1,2,1]])
array([[ 2.,  2.,  1.],
       [ 1.,  3.,  1.],
       [ 1.,  2.,  2.]])
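
For checking any sparse implementation against that expected behaviour, a small dense reference can be handy. This is only a sketch; dense_reference is a hypothetical helper name, and it assumes vec is a 1 x n sparse row.

```python
import numpy as np
import scipy.sparse as sp

def dense_reference(mat, vec):
    """Hypothetical helper: add a 1 x n sparse vec to every row of mat, densely."""
    # (m, n) + (1, n): NumPy broadcasts the single row over all m rows.
    return mat.toarray() + vec.toarray()

mat = sp.csc_matrix(np.eye(3))
vec = sp.csr_matrix([[1, 2, 1]])
expected = dense_reference(mat, vec)
```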

Thanks

Some tests with a 10x10 sparse mat and vec:

In [375]: mat=sparse.rand(10,10,.1) 
In [376]: mat
Out[376]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 10 stored elements in COOrdinate format>

In [377]: mat.A
Out[377]: 
array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.15568621,  0.59916335,  0.        ,  0.        ,  0.        ],
       ...
       [ 0.        ,  0.        ,  0.15552687,  0.        ,  0.        ,
         0.47483064,  0.        ,  0.        ,  0.        ,  0.        ]])

In [378]: vec=sparse.coo_matrix([0,1,0,2,0,0,0,3,0,0]).tocsr()
<1x10 sparse matrix of type '<class 'numpy.int32'>'
    with 3 stored elements in Compressed Sparse Row format>

maxymoo's solution:

def addvec(mat,vec):
    Mc = mat.tocsc()
    for i in vec.nonzero()[1]:
        Mc[:,i]=sparse.csc_matrix(Mc[:,i].todense()+vec[0,i])
    return Mc    

And a variation that uses the lil format, which is supposed to be more efficient when changing the sparsity structure:

def addvec2(mat,vec):
    Ml=mat.tolil()
    vec=vec.tocoo()                                            
    for i,v in zip(vec.col, vec.data):
        Ml[:,i]=sparse.coo_matrix(Ml[:,i].A+v)
    return Ml

The summation has 38 nonzero terms, up from 10 in the original mat. It fills in the 3 columns where vec is nonzero. That's a big change in sparsity.

In [382]: addvec(mat,vec)
Out[382]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 38 stored elements in Compressed Sparse Column format>

In [383]: _.A
Out[383]: 
array([[ 0.        ,  1.        ,  0.        ,  2.        ,  0.        ,
         0.        ,  0.        ,  3.        ,  0.        ,  0.        ],
       [ 0.        ,  1.        ,  0.        ,  2.        ,  0.        ,
         0.15568621,  0.59916335,  3.        ,  0.        ,  0.        ],
       ...
       [ 0.        ,  1.        ,  0.15552687,  2.        ,  0.        ,
         0.47483064,  0.        ,  3.        ,  0.        ,  0.        ]])

Same output with addvec2:

In [384]: addvec2(mat,vec)
Out[384]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 38 stored elements in LInked List format>
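
The jump in stored elements can be estimated up front: every column where vec is nonzero becomes fully dense, and the other columns keep their entries. A rough sketch follows, assuming no sum cancels to exactly zero; predicted_nnz is a hypothetical helper, and the diagonal test matrix is made up for illustration rather than taken from the session above.

```python
import numpy as np
import scipy.sparse as sparse

def predicted_nnz(mat, vec):
    """Hypothetical helper: stored entries after broadcasting vec over mat's rows,
    assuming no sum cancels to exactly zero."""
    mat = mat.tocsc()
    per_column = np.diff(mat.indptr)              # stored entries per column
    touched = np.zeros(mat.shape[1], dtype=bool)
    touched[vec.nonzero()[1]] = True              # columns where vec is nonzero
    # Touched columns become dense; untouched columns keep their stored entries.
    return int(mat.shape[0] * touched.sum() + per_column[~touched].sum())

mat = sparse.csc_matrix(np.diag(np.arange(1.0, 11.0)))   # 10 stored entries
vec = sparse.csr_matrix([[0, 1, 0, 2, 0, 0, 0, 3, 0, 0]])
estimate = predicted_nnz(mat, vec)
dense_count = np.count_nonzero(mat.toarray() + vec.toarray())
```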

And in timing, addvec2 is more than 2x faster:

In [385]: timeit addvec(mat,vec)
100 loops, best of 3: 6.51 ms per loop

In [386]: timeit addvec2(mat,vec)
100 loops, best of 3: 2.54 ms per loop

And the dense equivalents:

In [388]: sparse.coo_matrix(mat+vec.A)
Out[388]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 38 stored elements in COOrdinate format>

In [389]: timeit sparse.coo_matrix(mat+vec.A)
1000 loops, best of 3: 716 µs per loop

In [390]: timeit sparse.coo_matrix(mat.A+vec.A)
1000 loops, best of 3: 338 µs per loop

A version that might save on temporary dense matrix space runs in about the same time:

In [393]: timeit temp=mat.A; temp+=vec.A; sparse.coo_matrix(temp)
1000 loops, best of 3: 334 µs per loop

So the dense version does 5-7x better than my sparse version.

For a really large mat, memory issues might chew into the dense performance, but the iterative sparse solution(s) isn't going to shine either.
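
One possible middle ground for the memory concern, which I have not timed, is to do the dense addition on a block of rows at a time so that only one block is ever dense. The sketch below assumes mat has at least one row; add_row_vector_blocked and block_rows are hypothetical names.

```python
import numpy as np
import scipy.sparse as sparse

def add_row_vector_blocked(mat, vec, block_rows=1000):
    """Hypothetical helper: dense add in row blocks to cap temporary memory."""
    mat = mat.tocsr()                     # CSR supports row slicing
    row = vec.toarray()                   # dense (1, n) row to broadcast
    blocks = []
    for start in range(0, mat.shape[0], block_rows):
        dense_block = mat[start:start + block_rows].toarray()
        dense_block = dense_block + row   # NumPy broadcasting over the block's rows
        blocks.append(sparse.csr_matrix(dense_block))
    return sparse.vstack(blocks, format="csr")

mat = sparse.csr_matrix(np.diag(np.arange(1.0, 11.0)))
vec = sparse.csr_matrix([[0, 1, 0, 2, 0, 0, 0, 3, 0, 0]])
out = add_row_vector_blocked(mat, vec, block_rows=4)
```

The result is still as dense as the sum requires; only the temporary dense storage is limited to block_rows rows.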

I may be able to squeeze more performance from addvec2 by indexing Ml more efficiently. Ml.data[3], Ml.rows[3] is considerably faster than Ml[3,:] or Ml[:,3].
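
For readers who have not used these attributes, here is a tiny illustration of what rows and data hold in a lil matrix. The 2x3 matrix is made up for the example.

```python
import scipy.sparse as sparse

Ml = sparse.lil_matrix([[0, 5, 0], [7, 0, 9]])

# Each row of a LIL matrix is a pair of Python lists.
cols_row1 = Ml.rows[1]       # column indices stored in row 1
vals_row1 = Ml.data[1]       # the matching values

# rows/data are row-oriented, so transposing first exposes each original
# column as a LIL row.
Mtl = Ml.T.tolil()
rows_of_col2 = Mtl.rows[2]   # row indices stored in original column 2
vals_of_col2 = Mtl.data[2]
```

That row orientation is why the function below transposes before calling tolil.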

def addvec3(mat,vec):
    Mtl=mat.T.tolil()
    vec=vec.tocoo()
    n = mat.shape[0]
    for i,v in zip(vec.col, vec.data):
        t = np.zeros((n,))+v
        t[Mtl.rows[i]] += Mtl.data[i]
        t = sparse.coo_matrix(t)
        Mtl.rows[i] = t.col
        Mtl.data[i] = t.data
    return Mtl.T

In [468]: timeit addvec3(mat,vec)
1000 loops, best of 3: 1.8 ms per loop

A modest improvement, but not as much as I'd hoped. And squeezing a bit more:

def addvec3(mat,vec):
    Mtl = mat.T.tolil()
    vec = vec.tocoo()
    t0 = np.zeros((mat.shape[0],))
    r0 = np.arange(mat.shape[0])
    for i,v in zip(vec.col, vec.data):
        t = t0+v
        t[Mtl.rows[i]] += Mtl.data[i]
        Mtl.rows[i] = r0
        Mtl.data[i] = t
    return Mtl.T

In [531]: timeit mm=addvec3(mat,vec)
1000 loops, best of 3: 1.37 ms per loop

So your original matrix is sparse and the vector is sparse, but in the resulting matrix the columns corresponding to nonzero coordinates in your vector will be dense.

So we may as well materialise those columns as dense matrices:

def addvec(mat,vec):
   for i in vec.nonzero()[1]:
      mat[:,i] = sp.csc_matrix(mat[:,i].todense() + vec[0,i])
   return mat
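
A loop-free alternative, which is not from any of the answers above and which I have not benchmarked, is to tile vec into a sparse matrix with an outer product and then use ordinary sparse addition. A sketch, with add_row_vector_sparse as a hypothetical name and vec assumed to be a 1 x n sparse row:

```python
import numpy as np
import scipy.sparse as sparse

def add_row_vector_sparse(mat, vec):
    """Hypothetical helper: broadcast a 1 x n sparse vec over the rows of mat."""
    ones_col = sparse.csr_matrix(np.ones((mat.shape[0], 1)))
    tiled = ones_col @ vec.tocsr()    # (m, 1) @ (1, n): every row equals vec
    return mat.tocsr() + tiled        # sparse + sparse stays sparse

mat = sparse.csr_matrix(np.eye(3))
vec = sparse.csr_matrix([[1, 2, 1]])
result = add_row_vector_sparse(mat, vec)
```

The tiled matrix only stores the columns where vec is nonzero, so nothing fully dense is built unless vec itself has no zeros.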
