How to map corresponding values of a 2D NumPy array into a 1D array

I have written this piece of code:

import numpy as np

data = np.array([[3,6], [5,9], [4, 8]])

orig_x, orig_y = np.split(data, 2, axis=1)

x = np.array([3, 4])
y = np.zeros((len(x)))

for i in range(len(x)):
    y[i] = orig_y[np.where(orig_x == x[i])[0]]

So basically, I have a 2D NumPy array. I split it into two column arrays, orig_x and orig_y (each of shape (3, 1) here), one storing the x-axis values and the other the y-axis values.

I also have another 1D NumPy array, which has some of the values that exist in the orig_x array. I want to find the y-axis values for each value in the x array. I created this method, using a simple loop, but it is extremely slow since I'm using it with thousands of values.

Do you have a better idea? Maybe by using a NumPy function?

Note: Also a better title for this question can be made. Sorry:(

You could build a boolean mask marking the rows whose x value you want, and then use this mask to select the corresponding values from the y column.

data = np.array([[3,6], [5,9], [4, 8]])

# the values you want to lookup on the x-axis
x = np.array([3, 4])

mask = np.isin(data[:,0], x)
data[mask,1]

Output:

array([6, 8])

The key function here is np.isin. For this case it is roughly equivalent to broadcasting data's x column against x and doing an element-wise comparison, then checking whether any comparison matched in each row:

mask = data[:,0,None] == x
y_mask = np.logical_or.reduce(mask, axis=1)
data[y_mask, 1]

Output:

array([6, 8])
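
One thing worth checking before relying on either mask: the selected values come back in the row order of data, not in the order of x. The sketch below is only an illustration; the reversed x is an assumption chosen to make the effect visible.

```python
import numpy as np

data = np.array([[3, 6], [5, 9], [4, 8]])

# Lookup values deliberately given in the opposite order to the rows of data
x = np.array([4, 3])

# Mask built with np.isin
mask_isin = np.isin(data[:, 0], x)

# Same mask built by explicit broadcasting: (3, 1) == (2,) gives shape (3, 2)
mask_bcast = (data[:, 0, None] == x).any(axis=1)

# Boolean indexing walks the rows of data, so the order of x is ignored
y = data[mask_isin, 1]
print(y)  # [6 8], not [8 6]
```

If the order of x matters, or x can contain repeated values, a mask alone is probably not enough.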

I'm not 100% sure I understood the problem correctly, but I think the following should work:

>>> rows, cols = np.where(orig_x == x)
>>> y = orig_y[rows[np.argsort(cols)]].ravel()
>>> y
array([6, 8])

It assumes that all the values in orig_x are unique, but since your code example has the same restriction, I considered it a given.
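
Under the same uniqueness assumption, another possible approach is to sort the x column once and locate each lookup value with np.searchsorted. This is a minimal sketch, not a tested drop-in: it additionally assumes every value in x actually occurs in the x column, and the names order, sorted_x, sorted_y and pos are made up for the illustration.

```python
import numpy as np

data = np.array([[3, 6], [5, 9], [4, 8]])

# Lookup values in arbitrary order, with a repeat
x = np.array([4, 3, 4])

# Sort the rows by their x value once
order = np.argsort(data[:, 0])
sorted_x = data[order, 0]   # [3 4 5]
sorted_y = data[order, 1]   # [6 8 9]

# Binary-search each lookup value in the sorted x column.
# Assumption: every value in x is present in sorted_x; a missing value
# would silently return a neighbouring index or an out-of-range one.
pos = np.searchsorted(sorted_x, x)

y = sorted_y[pos]
print(y)  # [8 6 8]
```

The result follows the order of x and handles repeated lookup values, at the cost of one sort of the x column.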

What about a lookup table?

import numpy as np
data = np.array([[3,6], [5,9], [4, 8]])

orig_x, orig_y = np.split(data, 2, axis=1)

x = np.array([3, 4])
y = np.zeros((len(x)))

You can pack a dict for lookup:

lookup = {i: j for i, j in zip(orig_x.ravel(), orig_y.ravel())}

And just map this into a new array:

np.fromiter(map(lambda i: lookup.get(i, np.nan), x), dtype=int, count=len(x))
array([6, 8])

If orig_x and orig_y are the smaller of your data structures (that is, small compared with x), this will probably be the most efficient option.

EDIT: It has occurred to me that if your values are integers, the default np.nan won't work, because NaN cannot be stored in an integer array. If you may look up a value that isn't in your orig_x array, you should figure out which default value makes sense for your application.
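
To illustrate the point about missing values, here is one way it might be handled: build the result as a float array so that np.nan is representable. This is only a sketch; the lookup value 7, which is absent from the data, is an assumption added for the example.

```python
import numpy as np

data = np.array([[3, 6], [5, 9], [4, 8]])

# Plain Python ints as keys and values
lookup = dict(zip(data[:, 0].tolist(), data[:, 1].tolist()))

# 7 does not occur in the x column (hypothetical missing value)
x = np.array([3, 7, 4])

# dtype=float lets NaN mark the missing entry; with dtype=int the NaN
# could not be converted and np.fromiter would raise an error
y = np.fromiter(
    (lookup.get(k, np.nan) for k in x.tolist()),
    dtype=float,
    count=len(x),
)
print(y)  # [ 6. nan  8.]
```

If the result has to stay an integer array, a sentinel such as -1 could serve as the default instead, provided it can never be a real y value.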
