Multiplying Rows and Columns of Python Sparse Matrix by elements in an Array

I have a numpy array such as:

array = [0.2, 0.3, 0.4]

(this vector is actually dense with 300k elements; I'm just illustrating with simple examples)

and a sparse symmetric matrix created using Scipy as follows:

M = [[0, 1, 2]  
     [1, 0, 1]  
     [2, 1, 0]]

(represented as dense just to illustrate; in my real problem it's a (300k x 300k) sparse matrix)

Is it possible to multiply all rows by the elements in array and then perform the same operation on the columns?

This would first result in:

M = [[0 * 0.2, 1 * 0.2, 2 * 0.2]
     [1 * 0.3, 0 * 0.3, 1 * 0.3]
     [2 * 0.4, 1 * 0.4, 0 * 0.4]]

(rows are being multiplied by the elements in array)

M = [[0, 0.2, 0.4]
     [0.3, 0, 0.3]
     [0.8, 0.4, 0]]

And then the columns are multiplied:

M = [[0 * 0.2, 0.2 * 0.3, 0.4 * 0.4]
     [0.3 * 0.2, 0 * 0.3, 0.3 * 0.4]
     [0.8 * 0.2, 0.4 * 0.3, 0 * 0.4]]

Resulting finally in:

M = [[0, 0.06, 0.16]
     [0.06, 0, 0.12]
     [0.16, 0.12, 0]]

I've tried applying the solution I found in this thread, but it didn't work. I multiplied the data of M by the elements in array as suggested, then transposed the matrix and applied the same operation, but the result wasn't correct, and I still couldn't understand why!

Just to point this out, the matrix I'll be running this operation on is fairly big: it has 20 million non-zero elements, so efficiency is very important!

I appreciate your help!

Edit:

Bitwise's solution worked very well. Here it took 1.72 s to compute this operation, but that's OK for our work. Thanks!

In general you want to avoid loops and use matrix operations for speed and efficiency. In this case the solution is simple linear algebra, or more specifically matrix multiplication.

To multiply the columns of M by the array A, multiply M*diag(A). To multiply the rows of M by A, multiply diag(A)*M. To do both: diag(A)*M*diag(A), which can be accomplished by:

numpy.dot(numpy.dot(a, m), a)  # here a is diag(A) and m is M, both as dense NumPy arrays

diag(A) here is a matrix that is all zeros except for A on its diagonal. There are functions to create this matrix easily (e.g. numpy.diag() and scipy.sparse.diags()).

I expect this to run very fast.
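For a matrix stored with scipy.sparse, the diag(A)*M*diag(A) product described above can be sketched as follows. This is a minimal example on the 3x3 matrix from the question, assuming CSR storage and the `@` matrix-multiplication operator; the variable names `a`, `D` and `result` are chosen here for illustration.

```python
import numpy as np
from scipy import sparse

a = np.array([0.2, 0.3, 0.4])
# Small stand-in for the large sparse symmetric matrix from the question.
M = sparse.csr_matrix(np.array([[0, 1, 2],
                                [1, 0, 1],
                                [2, 1, 0]], dtype=float))

# Sparse diagonal matrix with the entries of a on its main diagonal.
D = sparse.diags(a)

# D @ M scales row i by a[i]; multiplying by D on the right scales column j by a[j].
result = (D @ M @ D).tocsr()

print(result.toarray())
```

Because D is stored as a sparse diagonal, neither product builds a dense 300k x 300k array; only the stored non-zero entries of M are touched.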

The following should work:

[[x*array[i]*array[j] for j, x in enumerate(row)] for i, row in enumerate(M)]

Example:

>>> array = [0.2, 0.3, 0.4]
>>> M = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
>>> [[x*array[i]*array[j] for j, x in enumerate(row)] for i, row in enumerate(M)]
[[0.0, 0.059999999999999998, 0.16000000000000003], [0.059999999999999998, 0.0, 0.12], [0.16000000000000003, 0.12, 0.0]]

Values are slightly off due to the limitations of floating point arithmetic. Use the decimal module if the rounding error is unacceptable.
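As a rough sketch of the decimal suggestion, the same list comprehension can be run with `Decimal` values built from strings, so that 0.2, 0.3 and 0.4 are represented exactly. This is only practical for small dense lists like the example, not for the 300k x 300k sparse matrix in the question.

```python
from decimal import Decimal

# Build the factors from strings so they are stored exactly.
array = [Decimal("0.2"), Decimal("0.3"), Decimal("0.4")]
M = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

# Same comprehension as above; int * Decimal yields a Decimal.
result = [[x * array[i] * array[j] for j, x in enumerate(row)]
          for i, row in enumerate(M)]

print(result[0][1], result[0][2], result[1][2])
```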

I use this combination:

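import numpy as np

def multiply(matrix, vector, axis):
    if axis == 1:
        # Scale each row in place: repeat vector[i] once per stored value in row i.
        # Relies on CSR storage (data ordered row by row) and a float dtype.
        val = np.repeat(vector, matrix.getnnz(axis=1))
        matrix.data *= val
    else:
        # Scale each column by broadcasting; this returns a new sparse matrix.
        matrix = matrix.multiply(vector)
    return matrix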

When the axis is 1 (multiply the rows), I replicate the second approach of this solution, and when the axis is 0 (multiply the columns), I use the matrix's multiply method.

The in-place variant (axis=1) is more efficient.
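Since the in-place path is the cheaper one, a possible variation is to apply both the row and the column factors directly to the stored values in a single pass. The sketch below is not part of the answer above: `scale_rows_and_columns` is a hypothetical helper, and it assumes the matrix is in CSR format with a floating-point dtype.

```python
import numpy as np
from scipy import sparse

def scale_rows_and_columns(matrix, vector):
    """Scale rows and columns of a CSR matrix in place (hypothetical helper)."""
    # Row factor for every stored value: vector[i] repeated once per entry in row i.
    row_scale = np.repeat(vector, np.diff(matrix.indptr))
    # matrix.indices holds the column index of every stored value.
    col_scale = vector[matrix.indices]
    matrix.data *= row_scale * col_scale
    return matrix

a = np.array([0.2, 0.3, 0.4])
M = sparse.csr_matrix(np.array([[0, 1, 2],
                                [1, 0, 1],
                                [2, 1, 0]], dtype=float))
scaled = scale_rows_and_columns(M, a)
print(scaled.toarray())
```

This modifies M itself and allocates only two temporary arrays whose length is the number of non-zero entries.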
