
Find Nearest Transition in N-dimensional Array

I want to efficiently find the index of the nearest transition in a numpy ndarray of integers, given a current index. A transition means a change of value.

For example, in the 2D array below, the right output for location (2,4) would be (3,6) (the transition from Class 1 to Class 8), at an approximate distance of 2.236. If there is more than one optimum, returning any of them would suffice.

import seaborn as sns
import numpy as np

step_size = [1,1]  # size of step in each dimension
arr = np.array([[6,6,1,1,1,1,1,1,1,8],[6,1,1,1,1,1,1,1,8,8],[6,1,1,1,1,1,1,1,8,8],[6,1,1,1,1,1,8,8,8,8],[6,6,1,1,1,1,1,8,8,8]])
sns.heatmap(arr, annot=True, cbar=False)
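The expected answer stated above can be reproduced with a brute-force search. This is only a reference sketch, assuming that "nearest transition" means the nearest cell whose value differs from the value at the current index, and that `step_size` scales the distance along each axis. The function name `nearest_different_value` is made up for illustration.

```python
import numpy as np

def nearest_different_value(arr, index, step_size=None):
    """Brute-force reference: nearest cell whose value differs from arr[index]."""
    arr = np.asarray(arr)
    index = tuple(index)
    # Assumption: step_size is the physical spacing of the grid along each axis.
    step = np.ones(arr.ndim) if step_size is None else np.asarray(step_size, dtype=float)
    # Coordinates (one row per cell) of every cell holding a different value.
    candidates = np.argwhere(arr != arr[index])
    if candidates.size == 0:
        return None, np.inf  # the array holds a single value, so no transition exists
    offsets = (candidates - np.asarray(index)) * step
    dists = np.linalg.norm(offsets, axis=1)
    best = dists.argmin()  # on ties, the first candidate in row-major order wins
    return tuple(int(v) for v in candidates[best]), float(dists[best])

arr = np.array([[6,6,1,1,1,1,1,1,1,8],[6,1,1,1,1,1,1,1,8,8],[6,1,1,1,1,1,1,1,8,8],
                [6,1,1,1,1,1,8,8,8,8],[6,6,1,1,1,1,1,8,8,8]])
loc, dist = nearest_different_value(arr, (2, 4), step_size=[1, 1])
print(loc, round(dist, 3))  # (3, 6) 2.236
```

It works for any number of dimensions, but it measures the distance to every differing cell on each call, so it is better suited to checking results than to large arrays.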


The application is to estimate distances to boundaries, such as a sample's distance to the decision boundary in classification algorithms for which no accurate algorithm or formula is available (like xgboost, and unlike SVM and decision trees).
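If distances are needed for many samples at once, one option is to compute a distance map for the whole grid in a single pass. The sketch below is an assumption about how that could be done, not something proposed in the question: it uses `scipy.ndimage.distance_transform_edt`, which returns for every foreground cell the Euclidean distance to the nearest background cell, and applies it once per class. The helper name `boundary_distance_map` is hypothetical.

```python
import numpy as np
from scipy import ndimage

def boundary_distance_map(arr, step_size=None):
    """For every cell, distance to (and index of) the nearest cell of another value."""
    arr = np.asarray(arr)
    dist = np.zeros(arr.shape, dtype=float)
    nearest = np.zeros((arr.ndim,) + arr.shape, dtype=int)
    # Note: if arr holds a single value there is no boundary and the result is not meaningful.
    for value in np.unique(arr):
        mask = arr == value  # foreground = cells of this class
        # Distance from each foreground cell to the closest cell outside the class.
        d, idx = ndimage.distance_transform_edt(
            mask, sampling=step_size, return_indices=True)
        dist[mask] = d[mask]
        nearest[:, mask] = idx[:, mask]
    return dist, nearest

arr = np.array([[6,6,1,1,1,1,1,1,1,8],[6,1,1,1,1,1,1,1,8,8],[6,1,1,1,1,1,1,1,8,8],
                [6,1,1,1,1,1,8,8,8,8],[6,6,1,1,1,1,1,8,8,8]])
dist, nearest = boundary_distance_map(arr, step_size=[1, 1])
print(round(float(dist[2, 4]), 3), nearest[:, 2, 4])
```

The loop runs once per distinct value rather than once per cell, so this fits best when the number of classes is small compared with the grid size.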

To detect a change, apply some kind of derivative to the matrix. This can be done with a Laplacian kernel. In the kernel you can define what type of change you are looking for; here I simply told it to look for a change in the left, right, up, and down squares (not the corners). Then find the closest square that has some change in it. This will not produce exactly the result you asked for; instead it finds the closest square associated with the boundary ((3, 5) in your example). You may modify it, and handle multiple optima, to your liking. As far as efficiency goes, I'm not saying this will win any prizes, but note that all looping happens implicitly inside numpy/scipy operations, which are optimized.

import numpy as np
import scipy.signal  # import the submodule explicitly; a bare "import scipy" may not load it

# This function is taken from:
#   https://stackoverflow.com/questions/61628380/calculate-distance-from-all-points-in-numpy-array-to-a-single-point-on-the-basis
def distmat_v2(a, index):
    i,j = np.indices(a.shape, sparse=True)
    return np.sqrt((i-index[0])**2 + (j-index[1])**2)

# Your original array.
arr = np.array([[6,6,1,1,1,1,1,1,1,8],[6,1,1,1,1,1,1,1,8,8], 
      [6,1,1,1,1,1,1,1,8,8],[6,1,1,1,1,1,8,8,8,8],[6,6,1,1,1,1,1,8,8,8]])

# Extract size of array for converting np.argmin output to array location.
r, c = np.shape(arr)

# Define and apply the edge detection kernel to the array. Then test for
#   inequality with zero. This sets 'True' for every element where the kernel
#   response is non-zero (some change detected) and 'False' otherwise.
edge_change_detection_kernel = np.array([[0,-1,0],[-1,4,-1],[0,-1,0]])
# edge_and_corner_change_detection_kernel = np.array([[-1,-1,-1],[-1,8,-1], 
#      [-1,-1,-1]])
change_arr = scipy.signal.convolve2d(arr, edge_change_detection_kernel, 
      boundary='symm', mode='same') != 0

# Calculate the distance from a certain point, say (2, 4), to all other points 
#   in the matrix.
distances_from_point = distmat_v2(arr, (2, 4))

# Element-wise multiply the logical matrix with the distances matrix. This keeps 
#   all the distances which had a change in the same square. It discards the 
#   distances which didn't have a change in the square.
changed_locations_distances_from_point = np.multiply(change_arr, 
      distances_from_point)

# Test for the smallest distance that is left after keeping only the changing 
#   squares. However, some locations will have zero in them, so set those values 
#   to infinity before testing the smallest value. This will find the smallest, 
#   non-zero distance.
closest_change_square_to_point = np.where(
  changed_locations_distances_from_point > 0, 
  changed_locations_distances_from_point, np.inf).argmin()

# Find the array location where the argmin function found the appropriate value.
print(f'row {closest_change_square_to_point // c}')
print(f'col {closest_change_square_to_point % c}')
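The same idea can be extended to N dimensions. The sketch below is one possible variant, not the code above: it flags a cell when its value differs from any axis-aligned neighbour instead of convolving with a Laplacian, because a Laplacian response can cancel to zero at a real change (a cell of value 2 with neighbours 1, 3, 2, 2 gives 4*2 - 8 = 0). The name `nearest_transition` and the `step_size` handling are assumptions made for illustration.

```python
import numpy as np

def nearest_transition(arr, index, step_size=None):
    """Return (location, distance) of the flagged cell closest to `index`.

    `index` is a sequence with one entry per dimension. A cell is flagged
    when it differs from at least one axis-aligned neighbour (no diagonals).
    """
    arr = np.asarray(arr)
    step = np.ones(arr.ndim) if step_size is None else np.asarray(step_size, dtype=float)

    changed = np.zeros(arr.shape, dtype=bool)
    for axis in range(arr.ndim):
        # True where a cell differs from its successor along this axis.
        diff = np.diff(arr, axis=axis) != 0
        lower = [slice(None)] * arr.ndim
        upper = [slice(None)] * arr.ndim
        lower[axis] = slice(None, -1)
        upper[axis] = slice(1, None)
        changed[tuple(lower)] |= diff  # the cell before the change
        changed[tuple(upper)] |= diff  # the cell after the change

    # Squared distance from `index` to every cell, built by broadcasting.
    grids = np.indices(arr.shape, sparse=True)
    dist2 = sum(((g - i) * s) ** 2 for g, i, s in zip(grids, index, step))

    # Like the code above, ignore the query cell itself and all unflagged cells.
    changed[tuple(index)] = False
    if not changed.any():
        return None, np.inf
    masked = np.where(changed, dist2, np.inf)
    loc = np.unravel_index(masked.argmin(), arr.shape)  # first minimum on ties
    return tuple(int(v) for v in loc), float(np.sqrt(masked[loc]))

arr = np.array([[6,6,1,1,1,1,1,1,1,8],[6,1,1,1,1,1,1,1,8,8],[6,1,1,1,1,1,1,1,8,8],
                [6,1,1,1,1,1,8,8,8,8],[6,6,1,1,1,1,1,8,8,8]])
print(nearest_transition(arr, (2, 4), step_size=[1, 1]))
```

Masking with `np.where(changed, dist2, np.inf)` replaces the multiply-then-filter step, and `np.unravel_index` converts the flat `argmin` result back to an index for any number of dimensions.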
