
Mix three vectors in a numpy array, then sort it

I have three numpy.ndarray vectors: X, Y and intensity. I would like to combine them into a single numpy array, then sort it by the third column (or the first one). I tried the following code:

m=np.column_stack((X,Y))
m=np.column_stack((m,intensity))
m=np.sort(m,axis=2)

Then I got the error: ValueError: axis(=2) out of bounds.

When I print m, I get:

array([[  109430,   285103,      121],
       [  134497,   284907,      134],
       [  160038,   285321,      132],
       ...,
       [12374406,  2742429,      148],
       [12371858,  2741994,      148],
       [12372221,  2742017,      161]])

How can I fix this, that is, get a sorted array?

axis=2 does not refer to a column index but to a dimension of the array. It means numpy will look for a third dimension in the data and sort along it, and a two-dimensional array such as m only has axes 0 and 1. Sorting along the first dimension ( axis = 0 ) orders the values within each column from smallest to largest, going down the rows. Sorting along the second dimension ( axis = 1 ) orders the values within each row from smallest to largest, going left to right. Examples are shown below.
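As a minimal sketch of the point above, the snippet below uses a small made-up array (not the asker's data) to show that a 2-D array only accepts axis 0 or 1, and what each of those does:

```python
import numpy as np

# Small illustrative array (made-up values, not the asker's data).
m = np.array([[3, 9, 2],
              [1, 7, 5],
              [2, 8, 4]])

print(m.ndim)  # number of dimensions; valid axis values are 0 .. m.ndim - 1

error_message = None
try:
    np.sort(m, axis=2)
except ValueError as exc:
    # Recent NumPy versions raise AxisError, which subclasses ValueError.
    error_message = str(exc)
    print(error_message)

sorted_down = np.sort(m, axis=0)    # each column sorted independently
sorted_across = np.sort(m, axis=1)  # each row sorted independently
print(sorted_down)
print(sorted_across)
```

Neither call keeps the original rows together, which is why a different approach is needed to sort by one column.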

Furthermore, np.sort behaves differently depending on the kind of array it is given. Two cases are considered: unstructured and structured.

Unstructured

import numpy as np

# Three random vectors standing in for the real data
X = np.random.randn(10)
Y = np.random.randn(10)
intensity = np.random.randn(10)
m = np.column_stack((X, Y))
m = np.column_stack((m, intensity))

m is treated as an unstructured array because there are no fields linked to any of the columns. In other words, if you call np.sort() on m , it just sorts the values from smallest to largest, top to bottom if axis=0 and left to right if axis=1 . Each column (or row) is sorted independently, so the original rows are not kept together.

Original :

[[ 1.20122251  1.41451461 -1.66427245]
 [ 1.3657312  -0.2318793  -0.23870104]
 [-0.30280613  0.79123814 -1.64082042]]

Axis=1 :

[[-1.66427245  1.20122251  1.41451461]
 [-0.23870104 -0.2318793   1.3657312 ]
 [-1.64082042 -0.30280613  0.79123814]]

Axis = 0 :

[[-0.30280613 -0.2318793  -1.66427245]
 [ 1.20122251  0.79123814 -1.64082042]
 [ 1.3657312   1.41451461 -0.23870104]]

Structured

As you can see, the data in each row is not kept together. If you would like the rows to stay intact, you need to attach labels to the columns through a dtype and create a structured array from it. np.sort then reorders whole rows by the field given in order = label_name , and you can sort by any of the other columns the same way.

dtype = [("a", float), ("b", float), ("c", float)]
m = [tuple(x) for x in m]  # a structured array is built from a list of tuples
labelled_arr = np.array(m, dtype)
print(np.sort(labelled_arr, order="a"))  # sort whole rows by field "a"

This will get:

[(-0.30280612629541204, 0.7912381363389004, -1.640820419927318)
 (1.2012225144719493, 1.4145146097431947, -1.6642724545574712)
 (1.3657312047892836, -0.23187929505306418, -0.2387010374198555)]
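Sorting by a different field works the same way. The sketch below reuses the three rows from the "Original" output above (rounded to four decimals for readability) and sorts by the third field; the variable names are illustrative only:

```python
import numpy as np

# The three rows from the "Original" output, rounded for readability.
rows = np.array([[ 1.2012,  1.4145, -1.6643],
                 [ 1.3657, -0.2319, -0.2387],
                 [-0.3028,  0.7912, -1.6408]])

dtype = [("a", float), ("b", float), ("c", float)]
labelled = np.array([tuple(r) for r in rows], dtype=dtype)

# Whole rows are reordered by field "c"; each row stays intact.
by_c = np.sort(labelled, order="c")
print(by_c["c"])  # the sort key, now ascending
print(by_c["a"])  # field "a" follows the new row order
```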

A more convenient way of doing this is to pass the data into a pandas DataFrame, which automatically creates column names from 0 to n-1 . Then you can call the sort_values method with the label of the column you want to sort by, followed by axis=0 (the default) if you would like the rows sorted from top to bottom just like in numpy .

Example:

import pandas as pd

pd.DataFrame(m).sort_values(0, axis=0)

Output:

          0         1         2
2 -0.302806  0.791238 -1.640820
0  1.201223  1.414515 -1.664272
1  1.365731 -0.231879 -0.238701
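If a plain numpy array is needed afterwards, the sorted frame can be converted back. This is a sketch only: the column names "X", "Y" and "intensity" are assumptions taken from the question, and the values are the three rounded rows used earlier:

```python
import numpy as np
import pandas as pd

rows = np.array([[ 1.2012,  1.4145, -1.6643],
                 [ 1.3657, -0.2319, -0.2387],
                 [-0.3028,  0.7912, -1.6408]])

# Column names are assumed for illustration; pandas would otherwise use 0, 1, 2.
df = pd.DataFrame(rows, columns=["X", "Y", "intensity"])

sorted_df = df.sort_values("intensity")  # rows reordered by the third column
sorted_rows = sorted_df.to_numpy()       # back to a plain 2-D ndarray
print(sorted_df)
```

The index of sorted_df still shows the original row positions, which can be handy for tracing rows back to the input.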

You are getting that error because a two-dimensional array has no axis with index 2. Axes are zero-indexed, so the only valid values here are 0 and 1. Regardless, np.sort will sort every column, or every row, independently. Consider from the docs :

order : str or list of str, optional When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

For example:

In [28]: a
Out[28]: 
array([[0, 0, 1],
       [1, 2, 3],
       [3, 1, 8]])

In [29]: np.sort(a, axis = 0)
Out[29]: 
array([[0, 0, 1],
       [1, 1, 3],
       [3, 2, 8]])

In [30]: np.sort(a, axis = 1)
Out[30]: 
array([[0, 0, 1],
       [1, 2, 3],
       [1, 3, 8]])

So, I think what you really want is this neat little idiom:

In [32]: a[a[:,2].argsort()]
Out[32]: 
array([[0, 0, 1],
       [1, 2, 3],
       [3, 1, 8]])
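Applied to the question, the idiom might look like the sketch below. The three vectors are short stand-ins built from the first rows of the printed array in the question, shuffled so that the sort has a visible effect; kind="stable" is an optional choice so that rows with equal keys keep their original relative order:

```python
import numpy as np

# Short stand-ins for the asker's vectors (values taken from the printed
# array in the question, shuffled for illustration).
X = np.array([160038, 109430, 134497])
Y = np.array([285321, 285103, 284907])
intensity = np.array([132, 121, 134])

# A single column_stack call can combine all three vectors as columns.
m = np.column_stack((X, Y, intensity))

# argsort returns the row order that sorts one column; indexing with it
# reorders whole rows, so each (X, Y, intensity) triple stays together.
by_intensity = m[m[:, 2].argsort(kind="stable")]  # by third column
by_x = m[m[:, 0].argsort(kind="stable")]          # by first column
by_intensity_desc = by_intensity[::-1]            # descending order

print(by_intensity)
print(by_x)
```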
