I have two NumPy arrays, X = [x1, x2, x3, x4] and y = [y1, y2, y3, y4]. Three of the elements are close to a counterpart in the other array, and the fourth may or may not be close.
Like:
X [ 84.04467948 52.42447842 39.13555678 21.99846595]
y [ 78.86529444 52.42447842 38.74910101 21.99846595]
Or it can be:
X [ 84.04467948 60 52.42447842 39.13555678]
y [ 78.86529444 52.42447842 38.74910101 21.99846595]
I want to define a function that finds the corresponding indices in the two arrays. In the first case, y[0] corresponds to X[0], y[1] to X[1], y[2] to X[2], and y[3] to X[3]. In the second case, y[0] corresponds to X[0], y[1] to X[2], y[2] to X[3], and y[3] to X[1]. I can't write a function that solves the problem completely. Please help.
This is based on two other answers: https://stackoverflow.com/a/8929827/3627387 and https://stackoverflow.com/a/12141207/3627387
Fixed version:
def find_closest(alist, target):
    # Return the element of alist with the smallest absolute difference to target
    return min(alist, key=lambda x: abs(x - target))

X = [84.04467948, 52.42447842, 39.13555678, 21.99846595]
Y = [78.86529444, 52.42447842, 38.74910101, 21.99846595]

def list_matching(list1, list2):
    # Work on a copy so matched elements can be removed without touching list1
    list1_copy = list1[:]
    pairs = []
    for i, e in enumerate(list2):
        elem = find_closest(list1_copy, e)
        # Store [index in list2, index in list1]
        pairs.append([i, list1.index(elem)])
        list1_copy.remove(elem)
    return pairs
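One caveat with the code above: list1.index(elem) returns the first occurrence of a value, so if list1 contains the same number twice, both matches would report the same index. A minimal sketch of a variant that tracks indices instead of values is shown here; the name match_by_index is hypothetical and not part of the original answer.

```python
def match_by_index(list1, list2):
    """Greedily pair each element of list2 with the closest unused element of list1.

    Returns a list of [index_in_list2, index_in_list1] pairs.
    """
    remaining = list(range(len(list1)))  # indices of list1 not matched yet
    pairs = []
    for i, target in enumerate(list2):
        if not remaining:
            break  # list1 is exhausted; leftover list2 elements stay unmatched
        # Pick the unused index whose value is closest to the current target
        best = min(remaining, key=lambda k: abs(list1[k] - target))
        pairs.append([i, best])
        remaining.remove(best)
    return pairs


X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(match_by_index(X2, Y2))  # [[0, 0], [1, 2], [2, 3], [3, 1]]
```

Like the original, this is a greedy pass in the order of list2, so the result can depend on that order.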
You can start by precomputing the distance matrix as shown in this answer:
import numpy as np
X = np.array([84.04467948,60.,52.42447842,39.13555678])
Y = np.array([78.86529444,52.42447842,38.74910101,21.99846595])
dist = np.abs(X[:, np.newaxis] - Y)
Now you can compute the minimums along one axis (I chose 1, corresponding to finding the closest element of Y for every X):
potentialClosest = dist.argmin(axis=1)
This may still contain duplicates (as in your case 2). To check for that, you can find all Y indices that appear in potentialClosest by using np.unique:
closestFound, closestCounts = np.unique(potentialClosest, return_counts=True)
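As a worked check, the steps so far can be run on the second example from the question. This is only a sketch that repeats the lines above in one self-contained block so the intermediate values can be inspected.

```python
import numpy as np

X = np.array([84.04467948, 60., 52.42447842, 39.13555678])
Y = np.array([78.86529444, 52.42447842, 38.74910101, 21.99846595])

# dist[i, j] is the absolute difference between X[i] and Y[j]
dist = np.abs(X[:, np.newaxis] - Y)

# For every X[i], the index of the closest Y[j]
potentialClosest = dist.argmin(axis=1)

# Which Y indices were chosen, and how often
closestFound, closestCounts = np.unique(potentialClosest, return_counts=True)

print(potentialClosest)  # [0 1 1 2]
print(closestFound)      # [0 1 2]
print(closestCounts)     # [1 2 1]
```

Both X[1] and X[2] pick Y[1] as their closest element, and Y[3] is not picked at all.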
Now you can check for duplicates by checking if closestFound.shape[0] == X.shape[0]. If so, you're golden and potentialClosest will contain your partners for every element in X. In your case 2, though, one element will occur twice, so closestFound will only have X.shape[0]-1 elements and closestCounts will contain one 2 in addition to the 1s. For all elements with count 1, the partner is already found. For the two candidates with count 2, though, you will have to choose the closer one, while the partner of the one with the larger distance will be the one element of Y which is not in closestFound. This can be found as:
# np.isin replaces np.in1d, which is deprecated in recent NumPy releases
missingPartnerIndex = np.where(
    ~np.isin(np.arange(Y.shape[0]), closestFound)
)[0][0]
You can do the matching in a loop (even though there might be a nicer way using numpy). This solution is rather ugly but works. Any suggestions for improvement are much appreciated:
partners = np.empty_like(X, dtype=int)
nonClosePartnerFound = False
for i in np.arange(X.shape[0]):
    if closestCounts[closestFound==potentialClosest[i]][0]==1:
        # A unique partner was found
        partners[i] = potentialClosest[i]
    else:
        # Partner is not unique
        if nonClosePartnerFound:
            partners[i] = potentialClosest[i]
        else:
            if np.argmin(dist[:, potentialClosest[i]]) == i:
                partners[i] = potentialClosest[i]
            else:
                partners[i] = missingPartnerIndex
                nonClosePartnerFound = True
print(partners)
This answer will only work if only one pair is not close. If that is not the case, you will have to define how to find the correct partner for multiple non-close elements. Sadly it's neither a very generic nor a very nice solution, but hopefully you will find it a helpful starting point.
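If more than one pair may be non-close, one possible rule is to repeatedly take the globally smallest remaining distance and then retire its row and column. That rule is an assumption here, not something the answer prescribes, and the function name greedy_match is hypothetical. A minimal sketch:

```python
import numpy as np

def greedy_match(X, Y):
    """Return partners, where partners[i] is the index in Y matched to X[i].

    Greedy rule (an assumption): always match the closest remaining pair first.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim != 1 or X.shape != Y.shape:
        raise ValueError("X and Y must be 1-D arrays of equal length")

    dist = np.abs(X[:, np.newaxis] - Y)
    partners = np.full(X.shape[0], -1, dtype=int)
    for _ in range(X.shape[0]):
        # Position of the smallest distance that is still available
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        partners[i] = j
        # Retire row i and column j so neither element is matched again
        dist[i, :] = np.inf
        dist[:, j] = np.inf
    return partners


X = [84.04467948, 60., 52.42447842, 39.13555678]
Y = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(greedy_match(X, Y))  # [0 3 1 2]
```

For the second example this gives the same partners as the loop above. Being greedy, it is not guaranteed to minimize the total distance over all pairs.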
It seems the best approach would be to pre-sort both arrays (O(n log n)) and then perform a merge-like traversal through both of them. That is asymptotically faster than the O(n^2) approach you mentioned in your comment.
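The answer gives no code, so the following is only one possible reading of "sort, then merge-like traversal": sort the indices of both arrays, then walk through them with a single forward-moving pointer to find the nearest X for every Y. The function name nearest_by_merge is hypothetical. Note that this step only finds nearest candidates; an X that is chosen twice still has to be resolved afterwards.

```python
def nearest_by_merge(X, Y):
    """For each Y[j], return the index of the nearest element of X.

    Both index lists are sorted by value (O(n log n)); the traversal itself is
    linear because the pointer into the sorted X values only moves forward.
    """
    if not X:
        raise ValueError("X must not be empty")
    x_order = sorted(range(len(X)), key=lambda i: X[i])
    y_order = sorted(range(len(Y)), key=lambda j: Y[j])

    nearest = [None] * len(Y)
    p = 0  # position in x_order of the current best candidate
    for j in y_order:
        # Advance while the next larger x is at least as close to Y[j]
        while (p + 1 < len(x_order)
               and abs(X[x_order[p + 1]] - Y[j]) <= abs(X[x_order[p]] - Y[j])):
            p += 1
        nearest[j] = x_order[p]
    return nearest


X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(nearest_by_merge(X2, Y2))  # [0, 2, 3, 3]
```

Here Y[2] and Y[3] both point at X[3], so the farther one (Y[3]) would have to be reassigned to the unused X[1].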
The code below simply prints the corresponding indices of the two arrays, as you did in your question, since I'm not sure what output you want your function to give.
X1 = [84.04467948, 52.42447842, 39.13555678, 21.99846595]
Y1 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
X2 = [84.04467948, 60, 52.42447842, 39.13555678]
Y2 = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
def find_closest(x_array, y_array):
    # Copy x_array as we will later remove an item with each iteration and
    # require the original later
    remaining_x_array = x_array[:]
    for y in y_array:
        differences = []
        for x in remaining_x_array:
            differences.append(abs(y - x))
        # min_index_remaining is the index position of the closest x value
        # to the given y in remaining_x_array
        min_index_remaining = differences.index(min(differences))
        # related_x is the closest x value of the given y
        related_x = remaining_x_array[min_index_remaining]
        # print() with a single formatted string works in Python 2 and 3
        print('Y[%s] corresponds to X[%s]' % (y_array.index(y), x_array.index(related_x)))
        # Remove the corresponding x value in remaining_x_array so it
        # cannot be selected twice
        remaining_x_array.pop(min_index_remaining)
This then outputs the following
find_closest(X1,Y1)
Y[0] corresponds to X[0]
Y[1] corresponds to X[1]
Y[2] corresponds to X[2]
Y[3] corresponds to X[3]
and
find_closest(X2,Y2)
Y[0] corresponds to X[0]
Y[1] corresponds to X[2]
Y[2] corresponds to X[3]
Y[3] corresponds to X[1]
Hope this helps.