
Sorting a NumPy array

I have a NumPy array that is constructed as the intersection of two other arrays in the following way:

The first array is:

[['! ! !' '! ! ! !']
 ['! ! !' '! ! ! "']
 ['! ! !' '! ! ! .']
 ...,
 ['}' 'was postponed']
 ['}' 'was']
 ['}' '{of']]

The second array is:

[['! ! !' '! ! ! !']
 ['! ! !' '! ! ! "']
 ['! ! !' '! ! ! .']
 ...,
 ['}' 'was postponed']
 ['}' 'was']
 ['}' '{of']]

There are in fact multiple differences between the two arrays, but they occur mainly in the middle rows.

The code used to construct the intersection is:

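import numpy as np

def multidim_intersect(arr1, arr2):
    # View each row as a single structured element so whole rows are compared
    arr1_view = arr1.view([('', arr1.dtype)] * arr1.shape[1])
    arr2_view = arr2.view([('', arr2.dtype)] * arr2.shape[1])
    # np.intersect1d returns the sorted, unique common elements
    intersected = np.intersect1d(arr1_view, arr2_view)
    # Convert back to a plain 2-D array with the original number of columns
    return intersected.view(arr1.dtype).reshape(-1, arr1.shape[1])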

The output array is:

[['!' '!']
 ['!' '! !']
 ['!' '! ! !']
 ...,
 ['}' 'was']
 ['}' 'was postponed']
 ['}' '{of']]

As you can see, my new array is sorted differently from the original two arrays (which have multiple exclamation marks sorted before single exclamation marks, as would be done in LC_ALL=C sort). Is there any way to get my output array in the same order as the other arrays? Note that the shape of the array must be preserved.

@Mr E arr1 and arr2 were originally lists. I can't give you the exact copy, but I will do my best to construct an example which illustrates what I need.

arr1 = [['! ! !' '! ! ! !']
 ['! ! !' '! ! ! "']
 ['! ! !' '! ! ! .']
 ['!' '!']
 ['}' 'was postponed']
 ['}' 'was']
 ['}' '{of']]

arr2 = [['! ! !' '! ! ! !']
 ['! ! !' '! ! ! "']
 ['! ! !' '! ! ! .']
 ['!' '!']
 ['}' 'was postponed']
 ['}' 'was']
 ['}' '{of']]

Ideally, the output would be:

[['! ! !' '! ! ! !']
 ['! ! !' '! ! ! "']
 ['! ! !' '! ! ! .']
 ['!' '!']
 ['}' 'was postponed']
 ['}' 'was']
 ['}' '{of']]

but instead it is:

[['!' '!']
 ['! ! !' '! ! ! !']
 ['! ! !' '! ! ! "']
 ['! ! !' '! ! ! .']
 ['}' 'was postponed']
 ['}' 'was']
 ['}' '{of']]

or something to that effect.
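
For reference, the behaviour can be reproduced with something like the sketch below. It assumes both inputs are C-contiguous, two-column string arrays of the same dtype; the rows are the illustrative ones listed above, not the real data, and the variable names rows and result are only for this example.

```python
import numpy as np

def multidim_intersect(arr1, arr2):
    # Same function as in the question: each row becomes one structured element
    arr1_view = arr1.view([('', arr1.dtype)] * arr1.shape[1])
    arr2_view = arr2.view([('', arr2.dtype)] * arr2.shape[1])
    intersected = np.intersect1d(arr1_view, arr2_view)
    return intersected.view(arr1.dtype).reshape(-1, arr1.shape[1])

# Illustrative rows only (assumption: the real data has the same layout)
rows = [['! ! !', '! ! ! !'],
        ['! ! !', '! ! ! "'],
        ['! ! !', '! ! ! .'],
        ['!', '!'],
        ['}', 'was postponed'],
        ['}', 'was'],
        ['}', '{of']]
arr1 = np.array(rows)
arr2 = np.array(rows)

result = multidim_intersect(arr1, arr2)
print(result.shape)  # shape is preserved: 7 rows, 2 columns
print(result[0])     # the single-exclamation-mark row has moved to the front
```

The shape survives the round trip, but the row order no longer matches arr1.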

I don't really understand the format of your input, but you can adapt this to suit your needs.

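The problem is that numpy.intersect1d() always returns its result sorted (it is documented to return the sorted, unique values common to both inputs). Luckily, it's not hard to write your own intersection function using numpy.in1d() (numpy.isin() in newer NumPy versions), which only tests membership and leaves the order alone. You can do something like this: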

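import numpy as np

arr1 = np.array([['! ! !', '! ! ! !'],
 ['! ! !', '! ! ! "'],
 ['! ! !', '! ! ! .'],
 ['!', '!'],
 ['a', 'ad'],   # Stuff you don't want to get back
 ['}', 'was postponed'],
 ['}', 'was'],
 ['}', '{of']])

arr2 = np.array([['! ! !', '! ! ! !'],
 ['! ! !', '! ! ! "'],
 ['! ! !', '! ! ! .'],
 ['!', '!'],
 ['b', 'ab'],   # Stuff you don't want to get back
 ['}', 'was postponed'],
 ['}', 'was'],
 ['}', '{of']])

# View each row as one structured element so rows are compared as a whole
# (both arrays must share the same dtype and number of columns)
row_dtype = [('', arr1.dtype)] * arr1.shape[1]
arr1_rows = arr1.view(row_dtype).ravel()
arr2_rows = arr2.view(row_dtype).ravel()

# True for every row of arr1 that also occurs in arr2
# (np.isin is the current name for the np.in1d test; np.in1d is deprecated since NumPy 2.0)
inarr = np.isin(arr1_rows, arr2_rows)

# Boolean indexing keeps arr1's row order and the 2-D shape
arr3 = arr1[inarr]

for row in arr3:
  print(row)

The output:

['! ! !' '! ! ! !']
['! ! !' '! ! ! "']
['! ! !' '! ! ! .']
['!' '!']
['}' 'was postponed']
['}' 'was']
['}' '{of']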
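
The same idea can be wrapped in a small helper. The sketch below is one possible variant rather than the code above: the name ordered_row_intersect is made up for illustration, it compares rows as Python tuples instead of going through np.in1d(), and it assumes both inputs are 2-D arrays small enough to convert to lists. Rows come back in the order they appear in arr1, and the result stays 2-D.

```python
import numpy as np

def ordered_row_intersect(arr1, arr2):
    """Return the rows of arr1 that also occur in arr2, in arr1's order."""
    # Hypothetical helper: rows become hashable tuples for fast set lookup
    rows_in_arr2 = set(map(tuple, arr2.tolist()))
    mask = np.array([tuple(row) in rows_in_arr2 for row in arr1.tolist()],
                    dtype=bool)
    # Boolean indexing keeps the original row order and the 2-D shape
    return arr1[mask]

arr1 = np.array([['! ! !', '! ! ! !'],
                 ['!', '!'],
                 ['a', 'ad'],            # only in arr1
                 ['}', 'was postponed'],
                 ['}', 'was']])
arr2 = np.array([['}', 'was'],
                 ['!', '!'],
                 ['b', 'ab'],            # only in arr2
                 ['! ! !', '! ! ! !'],
                 ['}', 'was postponed']])

common = ordered_row_intersect(arr1, arr2)
print(common)
```

Note that, unlike numpy.intersect1d(), this keeps duplicate rows of arr1 if there are any, and the order of arr2 has no effect on the result.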
