Multiplying a dense numpy matrix with a sparse scipy vector using @ is incredibly inefficient. It doesn't seem to make use of the sparsity of the vector at all.
Say we have
A = np.eye(5000)
x = np.eye(5000)[333]
x = scipy.sparse.coo_matrix(x).T # make it a sparse vector
Then multiplication using @
yields:
%timeit A @ x
8 ms ± 78.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Let's write an incredibly crappy sparse multiplication ourselves:
def mult_dense_with_sparse(A, x):
    return (A[:, x.nonzero()[0]] @ x.toarray()[x.nonzero()[0]]).T[0]
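For completeness, a quick sanity check (a sketch, restating the setup from above) confirming that this hack returns the same values as plain dense multiplication:

```python
import numpy as np
import scipy.sparse

# Same setup as above: identity matrix and a sparse one-hot column vector
A = np.eye(5000)
x = scipy.sparse.coo_matrix(np.eye(5000)[333]).T

def mult_dense_with_sparse(A, x):
    # Select only the columns of A matching x's nonzero rows,
    # then multiply by the corresponding nonzero entries of x
    return (A[:, x.nonzero()[0]] @ x.toarray()[x.nonzero()[0]]).T[0]

dense_result = (A @ x.toarray()).T[0]   # reference dense product
sparse_result = mult_dense_with_sparse(A, x)
assert np.allclose(dense_result, sparse_result)
```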
Lo and behold:
%timeit mult_dense_with_sparse(A, x)
50.3 µs ± 45.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
It's much faster! Even though this implementation first creates a dense vector (re-adding all the zeros) and then strips the zeros out again... So I'm wondering: if not @, how can I multiply a dense numpy matrix with a sparse scipy vector efficiently? Surely such a basic operation is part of scipy?
Edit: the solution provided in the other question is not helpful, as it is just as inefficient as @:
%timeit scipy.sparse.csr_matrix.dot(A, x)
7.97 ms ± 113 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Edit 2, in response to Hameer Abbasi:
Line # Hits Time Per Hit % Time Line Contents
==============================================================
101 @profile
102 def ratio(self, n):
103 80 51.0 0.6 0.0 s = n + 1
104 80 11401854.0 142523.2 16.1 self.sfpc = self.scolPCA(s) # sparse first principal component
106 80 351898.0 4398.7 0.5 wSums = (self.signals[:,self.sfpc.nonzero()[0]] @ self.sfpc.toarray()[self.sfpc.nonzero()[0]]).T[0]
108 80 56487433.0 706092.9 79.7 wSums = self.sfpc.T.dot(self.signals.T)[0]
110 80 2521189.0 31514.9 3.6 self.Flag, self.threshold, self.incline, self.deltaVar = self.actFunctOpt(list(wSums))
111 80 160.0 2.0 0.0 return self.deltaVar / (2 + s)
Here you can see that our 'hack' accounts for about 0.5% of this function's time, while using dot accounts for 79.7%.
In your example, A is of type np.ndarray, and x is of type scipy.sparse.coo_matrix.
If you're looking for the simplest answer possible to speed this up, here it is:
import numpy as np
import scipy.sparse as sps
A = np.random.rand(5000, 5000)
v = np.zeros((5000, 1))
v[333] = 1
v_s = sps.csc_matrix(v)
%timeit v_s.T.dot(A).T # 46.6 µs ± 1.11 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit A @ v # 6.75 ms ± 29.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
However, if you want the mechanics behind this answer: currently, the @ operator doesn't support overloading for a NumPy array due to a bug. Also, scipy.sparse.coo_matrix doesn't even try to override @, and matrix multiplication is faster for scipy.sparse.csr_matrix (a coo_matrix will first be converted to csr_matrix for multiplication anyway).
So what happens is NumPy converts your sparse vector into a NumPy array, then performs dense multiplication.
However, csr_matrix.dot exists and supports multiplication with dense NumPy arrays. So we use the identity A @ B = (B.T @ A.T).T, along with the fact that csc_matrix.T produces a csr_matrix, to arrive at the optimised code above.