
Bottleneck on indexing performance with NumPy arrays (or creation of tuples)

Imagine we have the following function:

def return_slice(self, k):
    return (self.A[self.C[k]:self.C[k+1]], self.B[self.C[k]:self.C[k+1]])

which is part of a class with arrays A, B and C, each holding a large number of integers (upwards of 10^5). Calling this function a few times is fast enough, but I've noticed that ~2 million calls take a very long time (~12 seconds in my last few runs). I've managed to do a bit better with this:

def return_slice(self, k):
    pos = slice(self.C[k], self.C[k + 1])
    return (self.A[pos], self.B[pos])

which brings that down to ~6 seconds. That is still too slow for me. I feel like I should change the whole way my arrays are structured, but I'm asking here because I may be missing the reason this is so slow.

Bear in mind that no structure can be assumed in the values of k; assume they are random on every execution.

I also think the creation of the tuple before returning may be the problem here, but it will take a ton of work to remove that -- I'd prefer to explore other alternatives.

Edit: A and B have the same size, but not the same data type.

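How about stacking A and B once, so that each call does a single slice? (Note that np.vstack takes a tuple of arrays and converts them to a common dtype.)

 self.D = np.vstack((self.A, self.B))

 def return_slice(self, k):
     pos = slice(self.C[k], self.C[k + 1])
     return tuple(self.D[:, pos])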

Let's time some slicing variations:

In [447]: A = np.ones(10000)
In [448]: timeit A[24:500]
285 ns ± 0.807 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [449]: C = np.array([24, 500])
In [450]: timeit A[C[0]:C[1]]
632 ns ± 3.99 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [451]: def foo(k):
     ...:     pos = slice(C[k],C[k+1])
     ...:     return A[pos]
     ...:
In [452]: timeit foo(0)
989 ns ± 4.33 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [453]: def foo(k):
     ...:     pos = slice(C[k],C[k+1])
     ...:     return A[pos], A[pos]
     ...:
In [454]: timeit foo(0)
1.31 µs ± 30 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

So calling foo 2 million times will take more than 2 seconds (2,000,000 × 1.31 µs ≈ 2.6 s).
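The measurements above use IPython's timeit magic. Outside IPython, a rough equivalent with the standard timeit module might look like the sketch below. The helper names (direct, with_slice_object, per_call_ns) are made up for illustration, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

A = np.ones(10000)
C = np.array([24, 500])

def direct():
    # slice bounds are read from the NumPy array C on every call
    return A[C[0]:C[1]]

def with_slice_object():
    # build the slice once, reuse it for both results
    pos = slice(C[0], C[1])
    return A[pos], A[pos]

def per_call_ns(func, number=20_000):
    # best of 5 repeats, converted to nanoseconds per call
    best = min(timeit.repeat(func, number=number, repeat=5))
    return best / number * 1e9

timings = {f.__name__: per_call_ns(f) for f in (direct, with_slice_object)}
for name, ns in timings.items():
    print(f"{name}: {ns:.0f} ns per call")
```

Multiplying the per-call figure by the expected number of calls gives a quick estimate of the total cost.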

Usually when I do timing tests, a time under a microsecond looks good unless it's a really trivial operation. The key to speed in numpy is to reduce the number of calls, more so than to speed up individual ones. "Vectorization" tries to eliminate many calls/iterations by using whole-array operations: one call using compiled numpy methods. That can give 10x or better speedups.

numba can move us in the compiled direction. With just a simple application of the njit decorator:
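As one illustration of reducing the number of calls, here is a minimal sketch that fetches the slices for many k values in a single pass. It assumes the caller can collect the k values up front, which the question does not confirm. The names gather_slices and ks are hypothetical. Unlike basic slicing, the fancy indexing used here returns copies, not views.

```python
import numpy as np

def gather_slices(A, B, C, ks):
    """Fetch A[C[k]:C[k+1]] and B[C[k]:C[k+1]] for every k in ks at once."""
    ks = np.asarray(ks)
    starts = C[ks]
    lengths = C[ks + 1] - starts
    # offsets[i] is where the i-th slice begins in the flat result
    offsets = np.concatenate(([0], np.cumsum(lengths)))
    # flat index: for slice i, the values starts[i], starts[i]+1, ...
    idx = np.repeat(starts - offsets[:-1], lengths) + np.arange(offsets[-1])
    return A[idx], B[idx], offsets

# small example: A and B have the same size but different dtypes
A = np.arange(10) * 10
B = np.arange(10, dtype=np.float64) / 2
C = np.array([0, 3, 4, 8, 10])

a_flat, b_flat, offsets = gather_slices(A, B, C, [2, 0])
# the i-th requested slice is a_flat[offsets[i]:offsets[i + 1]]
print(a_flat[offsets[0]:offsets[1]])  # same values as A[C[2]:C[3]]
```

Whether this helps depends on how the results are consumed: if each slice is still processed one at a time in Python, the per-call overhead just moves elsewhere.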

In [456]: @numba.njit 
     ...: def foo(k): 
     ...:     pos = slice(C[k],C[k+1]) 
     ...:     return A[pos], A[pos] 

In [459]: timeit foo(0)                                                                                
555 ns ± 4.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

and collecting a million such calls:

In [473]: timeit [foo(0) for _ in range(1000000)]                                                      
1.09 s ± 38.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

