
Multiplying elements in a sparse array with rows in matrix

If you have a sparse matrix X:

>>> X = csr_matrix([[0,2,0,2],[0,2,0,1]])
>>> print(type(X))
<class 'scipy.sparse.csr.csr_matrix'>
>>> print(X.todense())
[[0 2 0 2]
 [0 2 0 1]]

And a matrix Y:

>>> print(type(Y))
<class 'numpy.matrixlib.defmatrix.matrix'>
>>> print(Y)
[[8]
 [5]]

...how can you multiply each element of X by the corresponding row of Y? For example:

[[0*8 2*8 0*8 2*8]
 [0*5 2*5 0*5 1*5]]

or:

[[0 16 0 16]
 [0 10 0 5]]

I've tried this, but obviously it doesn't work as the dimensions don't match: Z = X.data * Y

Unfortunately, the .multiply method of the CSR matrix seems to densify the matrix if the other one is dense. So this would be one way of avoiding that:

import numpy as np
import scipy.sparse

# Assuming that Y is 1D; you might need Y = Y.A.ravel() or similar first.

# Just to make the point that this works only with CSR:
if not isinstance(X, scipy.sparse.csr_matrix):
    raise ValueError('Matrix must be CSR.')

Z = X.copy()
# Repeat each value in Y once per stored (nnz) element in the matching row:
Z.data *= Y.repeat(np.diff(Z.indptr))

This does create some temporaries, but at least it's fully vectorized, and it does not densify the sparse matrix.
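As a self-contained check, the CSR approach can be run on the X and Y from the question. This is only a minimal sketch: the helper name scale_rows_csr is made up for illustration, and it assumes Y may arrive as a column np.matrix, which it flattens with np.asarray(Y).ravel().

```python
import numpy as np
import scipy.sparse as sps


def scale_rows_csr(X, y):
    """Multiply row i of a CSR matrix by y[i] without densifying.

    Hypothetical helper, not part of SciPy.
    """
    if not isinstance(X, sps.csr_matrix):
        raise ValueError('Matrix must be CSR.')
    y = np.asarray(y).ravel()  # accept a column np.matrix or a 1-D array
    Z = X.copy()
    # np.diff(indptr) gives the number of stored entries in each row,
    # so repeat() lines up one factor with every stored value.
    # Plain assignment (not *=) lets the dtype widen if y is float.
    Z.data = Z.data * y.repeat(np.diff(Z.indptr))
    return Z


X = sps.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.matrix([[8], [5]])

Z = scale_rows_csr(X, Y)
print(type(Z))
print(Z.toarray())
```

Assigning to Z.data instead of using *= is a deliberate choice here: an in-place multiply of integer data by float factors would raise a casting error.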


For a COO matrix the equivalent is:

Z.data *= Y[Z.row]  # you can use np.take, which is faster than indexing.

For a CSC matrix the equivalent would be:

Z.data *= Y[Z.indices]
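
The COO and CSC one-liners can be checked against the question's data. A minimal sketch, assuming Y is already a 1-D array with one factor per row; the variable names Z_coo and Z_csc are only for illustration.

```python
import numpy as np
import scipy.sparse as sps

dense = [[0, 2, 0, 2], [0, 2, 0, 1]]
Y = np.array([8, 5])  # assumed 1-D, one factor per row

# COO: row[k] is the row index of the k-th stored value
Z_coo = sps.coo_matrix(dense)
Z_coo.data = Z_coo.data * np.take(Y, Z_coo.row)

# CSC: indices[k] is the row index of the k-th stored value
Z_csc = sps.csc_matrix(dense)
Z_csc.data = Z_csc.data * Y[Z_csc.indices]

print(Z_coo.toarray())
print(Z_csc.toarray())
```

In both formats the trick is the same: look up the row index of every stored value and use it to pick the matching factor from Y.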

To perform row-wise (resp. column-wise) multiplication, I use matrix multiplication with a diagonal matrix on the left (resp. on the right):

import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0,2,0,2],
                   [0,2,0,1]])
Y = np.array([8, 5])

D = sp.diags(Y) # produces a diagonal matrix whose entries are the values of Y
Z = D.dot(X) # performs D @ X, multiplication on the left for row-wise action

Sparsity is preserved (in CSR format):

print(type(Z))
>>> <class 'scipy.sparse.csr.csr_matrix'>

And the output is also correct:

print(Z.toarray()) # Z is still sparse and gives the right output
[[ 0. 16.  0. 16.]
 [ 0. 10.  0.  5.]]
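
The column-wise case works the same way, with the diagonal matrix on the right. A minimal sketch, assuming a hypothetical vector W of per-column factors (W is not part of the question and is chosen only so the effect is easy to see):

```python
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0, 2, 0, 2],
                   [0, 2, 0, 1]])
W = np.array([1, 10, 100, 1000])  # hypothetical factors, one per column

D = sp.diags(W)   # 4x4 diagonal matrix built from W
Z = X.dot(D)      # multiplication on the right scales the columns

print(sp.issparse(Z))
print(Z.toarray())
```

Column j of Z is column j of X multiplied by W[j], and the result stays sparse.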

I had the same problem. Personally, I didn't find the scipy.sparse documentation very helpful, nor did I find a function that handles this directly. So I tried writing it myself, and this solved it for me:

Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]

The idea is: for each element of Y at position row_y_idx, perform a scalar multiplication with the row_y_idx-th row of X. More info about accessing elements in CSR matrices here (where data is A and IA is indptr).
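
To see why slicing data with indptr selects one row, it may help to print the three CSR arrays for the X from the question. This is just an illustrative sketch of the storage layout:

```python
import numpy as np
import scipy.sparse as sps

X = sps.csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])

print(X.data)     # stored non-zero values, row by row
print(X.indices)  # column index of each stored value
print(X.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]

for i in range(X.shape[0]):
    start, stop = X.indptr[i], X.indptr[i + 1]
    # values and column positions of the non-zeros in row i
    print(i, X.data[start:stop], X.indices[start:stop])
```

Here data holds the four stored values, and indptr marks that the first two belong to row 0 and the last two to row 1, so multiplying a slice of data scales exactly one row.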

Given X and Y as you defined:

import numpy as np
import scipy.sparse as sps

X = sps.csr_matrix([[0,2,0,2],[0,2,0,1]])
Y = np.matrix([[8], [5]])

Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]

print(type(Z))
print(Z.todense())

The output is the same as yours:

<class 'scipy.sparse.csr.csr_matrix'>
 [[ 0 16  0 16]
  [ 0 10  0  5]]
