
Fast slicing of numpy array multiple times

I have an array like np.arange(100000) and I need to retrieve the data between pairs of indexes many times. What I am currently running is slow:

import numpy as np

data = np.arange(100000)
# This array usually contains thousands of slices
slices = np.array( [
       [1, 4],
       [10,20],
       [100,110],
       [1000,1220]
])

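# One way I have been doing it: build a flat list of indexes, then take them
np.take(data, [idx for iin, iout in slices for idx in range(iin, iout)])
# The other way: one slice per pair (this returns a list of arrays)
[data[iin:iout] for iin, iout in slices]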

Both ways are slow, and I need this to be very fast. I am looking for something like this:

data[slices[:,0], slices[:,1]]

Here are some timings with your slices and data = np.arange(2000).

Your take version, with the comprehension yielding idx:

In [360]: timeit np.take(data, [idx for iin, iout in slices for idx in range(iin,iout)])
10000 loops, best of 3: 92.5 us per loop

In [359]: timeit data[[idx for iin, iout in slices for idx in range(iin,iout)]]
10000 loops, best of 3: 92.2 us per loop

Your 2nd version, wrapped in np.concatenate so that it returns a single array, is quite a bit better:

In [361]: timeit np.concatenate([data[iin:iout] for iin,iout in slices])
100000 loops, best of 3: 15.8 us per loop

Using np.r_ to build the index from the slices is not much of an improvement over your 1st version:

In [362]: timeit data[np.r_[tuple([slice(i[0],i[1]) for i in slices])]]
10000 loops, best of 3: 79 us per loop
In [363]: timeit np.r_[tuple([slice(i[0],i[1]) for i in slices])]
10000 loops, best of 3: 67.5 us per loop

Constructing the index takes the bulk of the time: about 67.5 us of the 79 us total.
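To make the np.r_ step concrete, here is a small sketch of what the index looks like before it is applied. The two-row slices array is an illustrative assumption, shortened so the result is easy to read.

```python
import numpy as np

data = np.arange(2000)
# Illustrative, shortened set of [start, stop) pairs
slices = np.array([[1, 4], [10, 13]])

# np.r_ expands each slice object into its integer range and joins the ranges
index = np.r_[tuple(slice(iin, iout) for iin, iout in slices)]
print(index.tolist())        # [1, 2, 3, 10, 11, 12]

# Indexing with that array is ordinary fancy indexing
print(data[index].tolist())  # [1, 2, 3, 10, 11, 12]
```

The indexing itself is a single vectorized call; the cost measured above comes from assembling the index array.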

Of course, the rankings at this size might change for a much larger problem.
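One way to check the rankings at a larger size is to rerun the comparison with the standard timeit module. The sketch below is only a harness: the workload (5000 random slices of length 1 to 50 over a 100000-element array) is an assumption made up for illustration, and no timings are claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(100_000)

# Hypothetical workload: 5000 random slices, each 1 to 50 elements long
starts = rng.integers(0, data.size - 50, size=5000)
slices = np.column_stack([starts, starts + rng.integers(1, 51, size=5000)])

def take_listcomp():
    # Build a flat Python list of indexes, then take them in one call
    return np.take(data, [idx for iin, iout in slices for idx in range(iin, iout)])

def concat_slices():
    # Slice once per pair, then join the pieces
    return np.concatenate([data[iin:iout] for iin, iout in slices])

# Both approaches must agree before the timings mean anything
assert np.array_equal(take_listcomp(), concat_slices())

for fn in (take_listcomp, concat_slices):
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```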

Since your slices vary in length, there isn't much hope of generating them all in a vectorized way, that is, 'in parallel'. I don't know whether a Cython implementation would speed it up much or not.
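For what it is worth, the flat index for variable-length slices can be assembled without a Python-level loop by combining np.cumsum and np.repeat. The sketch below is not from the original answer and has not been timed here. flat_slice_index is a hypothetical helper name, and it assumes every stop is greater than or equal to its start and that all indexes lie within the array bounds. Whether it beats np.concatenate at a given size would need measuring.

```python
import numpy as np

def flat_slice_index(slices):
    """Build one flat index array covering every [start, stop) pair.

    Hypothetical helper; assumes stop >= start for every row.
    """
    slices = np.asarray(slices)
    starts, stops = slices[:, 0], slices[:, 1]
    lengths = stops - starts
    # Position in the output where each slice's first element lands
    offsets = np.cumsum(lengths) - lengths
    # Output position p inside slice k maps to starts[k] + (p - offsets[k])
    return np.arange(lengths.sum()) + np.repeat(starts - offsets, lengths)

data = np.arange(2000)
slices = np.array([[1, 4], [10, 20], [100, 110], [1000, 1220]])

result = data[flat_slice_index(slices)]
expected = np.concatenate([data[iin:iout] for iin, iout in slices])
assert np.array_equal(result, expected)
```

Unlike slicing, fancy indexing raises an IndexError for out-of-range values instead of clipping them, so the bounds assumption matters.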

More timings from an earlier, similar question: https://stackoverflow.com/a/11062055/901925

