
Is there a faster way than looping over numpy arrays?

If I have two numpy arrays of values, how can I quickly build a third array that counts how many times each pair of values occurs across the first two arrays?

For example:

import numpy as np

x = np.round(np.random.random(2500),2)
xIndex = np.linspace(0, 1, 100)

y = np.round(np.random.random(2500)*10,2)
yIndex = np.linspace(0, 10, 1000)

z = np.zeros((100,1000))

Right now, I'm doing the following loop (which is prohibitively slow):

for m in x:
    for n in y:
        q = np.where(xIndex == m)[0][0]
        l = np.where(yIndex == n)[0][0]
        z[q][l] += 1

Then I can do a contour plot (or heat map, or whatever) of xIndex, yIndex, and z. But I know this isn't a "Pythonic" way of solving the problem, and there's no way I can run it over the hundreds of millions of data points I have in anything approaching a reasonable timeframe.

How do I do this the right way? Thanks for reading!

You can shorten the code dramatically.

First, since you are binning on a linear scale, you can eliminate the explicit arrays xIndex and yIndex entirely. You can express the exact indices into z as

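# Scale by (number of bins - 1) so the rounded indices stay inside z's shape (100, 1000)
xi = np.round(np.random.random(2500) * 99).astype(int)    # 0..99
yi = np.round(np.random.random(2500) * 999).astype(int)   # 0..999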

Second, you don't need the loop. The issue with the normal + operator (aka np.add) on a fancy index, as in z[xi, yi] += 1, is that it's buffered. A consequence of that is that an index that occurs several times is incremented only once, so you won't get the right count for multiple occurrences of the same index. Fortunately, ufuncs have an at method to handle that, and add is a ufunc.
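To make the buffering point concrete, here is a minimal sketch on a toy array. The names idx, buffered and unbuffered are made up for this illustration and are not part of the question's code.

```python
import numpy as np

idx = np.array([0, 0, 0, 2])   # index 0 appears three times

buffered = np.zeros(3)
buffered[idx] += 1             # repeated index 0 is incremented only once

unbuffered = np.zeros(3)
np.add.at(unbuffered, idx, 1)  # every occurrence of an index is counted

print(buffered)    # [1. 0. 1.]
print(unbuffered)  # [3. 0. 1.]
```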

Third, and finally, broadcasting allows you to specify how to mesh the arrays for a fancy index:

np.add.at(z, (xi[:, None], yi), 1)
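Because the meshed index pairs every x with every y, the broadcast index grows with the square of the number of points, which may not be practical for very large inputs. As a sketch of a possible alternative (not part of the original answer), the same counts should come out of the outer product of the per-axis bin counts. The small random index arrays and the seed below are stand-ins chosen only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)        # seed chosen only for reproducibility
xi = rng.integers(0, 100, size=300)   # stand-in bin indices along x
yi = rng.integers(0, 1000, size=300)  # stand-in bin indices along y

# Meshed accumulation: one increment per (xi[m], yi[n]) combination.
z_mesh = np.zeros((100, 1000))
np.add.at(z_mesh, (xi[:, None], yi), 1)

# Same counts from the outer product of the per-axis bin counts.
z_outer = np.outer(np.bincount(xi, minlength=100),
                   np.bincount(yi, minlength=1000))

print(np.array_equal(z_mesh, z_outer))  # True
```

The outer-product form only needs one pass over each array plus a 100 by 1000 multiplication, so it avoids building the N by N index.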

If you're building a 2D histogram, where each x[i] is paired only with its own y[i] rather than with every y, you don't need to round the raw data. You can round just the indices instead:

x = np.random.random(2500)
y = np.random.random(2500) * 10

z = np.zeros((100,1000))
# Scale to the last valid index on each axis (99 and 999) so no index falls outside z
np.add.at(z, (np.round(99 * x).astype(int), np.round(99.9 * y).astype(int)), 1)
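For the paired case, NumPy also ships np.histogram2d, which may be simpler than computing indices by hand. The sketch below is an alternative, not the answer's method: it assumes equal-width bins over [0, 1] and [0, 10], which is slightly different from rounding to the nearest grid point, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
x = rng.random(2500)            # values in [0, 1)
y = rng.random(2500) * 10       # values in [0, 10)

# 100 equal-width bins over [0, 1] and 1000 over [0, 10].
z, x_edges, y_edges = np.histogram2d(
    x, y, bins=(100, 1000), range=((0, 1), (0, 10))
)

print(z.shape)  # (100, 1000)
print(z.sum())  # 2500.0
```

The returned x_edges and y_edges hold the bin boundaries (101 and 1001 values here), which can be used as axes for a heat map or contour plot.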

