Mapping z's in numpy array A = [[x0, y0, z0], [x1, y1, z1]] for the 3rd column of array B = [[x1, y1, ?], [x0, y0, ?]] by matching (x, y)?

Question

I have a numpy array T whose rows have the column structure [x, y, value], where x, y, and value are integers. An example T looks like this:

[[1, 0, 4],
 [0, 2, 3],
 [1, 2, 7]]

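This data comes from a model, where the third column gives the value of the variable identified by the tuple (x, y). In the model, this tuple is the label of the value. For example, my label T_10 (subscript 10) has the value 4, T_02 has the value 3, and T_12 has the value 7.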

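Now I want to swap a pair of labels. For example, I want to replace every label 2 with 1 (and vice versa), which turns the labels of the previous example into T_20, T_01, and T_21 respectively. The new data is then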

U = [[2, 0, ?],
     [0, 1, ?],
     [2, 1, ?]]

My problem is that I don't know how to make my new data look like this:

U = [[2, 0, -3],
     [0, 1, -4],
     [2, 1, -7]]

The new data should follow two rules:

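First, it should correctly identify the rows of T whose first and second columns (x, y) equal the new (x, y) in U. For each row of U, if (x, y) of U = (x, y) of T as an ordered pair, then the '?' in the third column of U should be the corresponding value from T.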

Second, if instead (x, y) of U = (y, x) of T, then it should be the negative of the corresponding value.
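
To make the two rules concrete, here is a minimal, deliberately unvectorized sketch that applies them with a plain dictionary. The helper name fill_values and the array name U_xy (the first two columns of U) are introduced only for this illustration, and the sketch assumes every pair of U has exactly one partner in T, either directly or reversed.

```python
import numpy as np

def fill_values(T, U_xy):
    """Fill the third column of U from T using the two rules (hypothetical helper)."""
    # Map each ordered pair (x, y) of T to its value.
    lookup = {(int(x), int(y)): v for x, y, v in T}
    rows = []
    for x, y in U_xy:
        x, y = int(x), int(y)
        if (x, y) in lookup:
            # Rule 1: the same ordered pair exists in T, copy its value.
            rows.append([x, y, lookup[(x, y)]])
        else:
            # Rule 2: only the reversed pair exists in T, negate its value.
            rows.append([x, y, -lookup[(y, x)]])
    return np.array(rows)

T = np.array([[1, 0, 4], [0, 2, 3], [1, 2, 7]])
U_xy = np.array([[2, 0], [0, 1], [2, 1]])
U = fill_values(T, U_xy)
print(U)
```

This loop is only meant to pin down the expected result; it raises a KeyError if a pair of U has no partner in T at all.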

My attempt first extracts the columns of T and then swaps the pair of labels with the following function:

def swap_indices(a, pair):
    for n, i in enumerate(a):
        if i == pair[0]: # check whether a0's element is = swap element 1
            a[n] = pair[1]
        elif i == pair[1]: 
            a[n] = pair[0]
    return a

For example, to swap label 0 with 1 (and vice versa) in both the x column and the y column, I would use:

pair = (0, 1)
a0 = swap_indices(T[:,0], pair) # column x  
a1 = swap_indices(T[:,1], pair) # column y
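
One thing to watch with the calls above: basic slicing such as T[:, 0] returns a view, so swap_indices writes the swapped labels straight into T, and a0 and a1 then always agree with T's own columns. A possible alternative, sketched here under the assumption that T should stay untouched, is to build the swapped columns with np.where. The name swap_labels is hypothetical.

```python
import numpy as np

def swap_labels(col, pair):
    """Return a new array with the two labels in `pair` exchanged (hypothetical helper)."""
    a, b = pair
    col = np.asarray(col)
    # Where col == a write b, where col == b write a, otherwise keep the label.
    return np.where(col == a, b, np.where(col == b, a, col))

T = np.array([[1, 0, 4], [0, 2, 3], [1, 2, 7]])
pair = (1, 2)
# First two columns of U; T itself is not modified.
U_xy = np.column_stack((swap_labels(T[:, 0], pair),
                        swap_labels(T[:, 1], pair)))
print(U_xy)
```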

Then I loop over the rows of T (num_rows_of_T of them):

for k in range(num_rows_of_T):
    temp = np.where((T[k, 0] == a0[k]) & (T[k, 1] == a1[k]) | ((T[k, 0] == a1[k]) & (T[k, 1] == a0[k])))

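Above, I am trying to get the indices of the rows where (x, y) of U = (x, y) of T or (x, y) of U = (y, x) of T. However, this is where I am stuck. I don't think the above is correct. Moreover, this approach doesn't let me apply the second rule, i.e. taking the negative of the T value when (x, y) = (y, x). I also tried using set() as a starting point (to get unordered pairs), but even then I couldn't correctly find the corresponding value in T.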

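Essentially, I want to find the T values that match the new labels in U. My data is well behaved: only one set of possible coordinates can exist, and there is always a bijective mapping between the (x, y) of T and the (x, y) of U (given my two rules).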

Any suggestions? Please help edit the question as needed. I'm having a hard time asking it.

Here is a minimal working example:

import numpy as np

# swap index labels if match swap pair
def swap_indices(a, pair):
    for n, i in enumerate(a):
        if i == pair[0]: # check whether a0's element is = swap element 1
            a[n] = pair[1]
        elif i == pair[1]: 
            a[n] = pair[0]
    return a
        
def find_valid_swaps(The1 = np.array([1, 0, -1, 1, 0, 1]), headers = np.array(['10', '20', '21', '30', '31', '32'])):
 
    num_indices = len(The1)
    T = np.zeros((num_indices,3)); U = T;
    
    # match format given for T in question
    for i in range(num_indices):
        T[i,:] = [int(list(headers[i])[0]), int(list(headers[i])[1]), The1[i]]
    
    pair = (0, 1) # label pair to swap
    a0 = swap_indices(T[:, 0], pair) # column 0 of U
    a1 = swap_indices(T[:, 1], pair) # column 1 of U
    
    # try to extract correct 'value' from T based on new labels in U
    for k in range(num_indices):
        temp = np.where((T[k, 0] == a0[k]) & (T[k, 1] == a1[k]) | ((T[k, 0] == a1[k]) & (T[k, 1] == a0[k])))
        print("temp",temp[0][0])
        U[k, :] = [a0[k], a1[k], T[temp[0][0], 2]] # here, I would finally create the new U matrix, applying both rules

    print(U)

find_valid_swaps()

A more relevant example using @MadPhysicist's answer:

# swap index labels if match swap pair
def swap_indices(a, pair):
    for n, i in enumerate(a):
        if i == pair[0]: # check whether a0's element is = swap element 1
            a[n] = pair[1]
        elif i == pair[1]: 
            a[n] = pair[0]
    return a
    
def key(arr, m):
    return arr[:, 0] * m + arr[:, 1]
    
def find_valid_swaps(Thetas1 = np.array([1, 1, 0, 0, -1, -1]), Thetas2 = np.array([1, 0, -1, 1, 0, 1]), num_bands = 4, headers = np.array(['10', '20', '21', '30', '31', '32'])):
    
    import itertools # for permutations: https://stackoverflow.com/questions/40092474/get-all-pairwise-combinations-from-a-list
    
    if (Thetas1==Thetas2).all():
        print("Warning: Input sets of indices are equal to each other. Will check other possible permutations regardless.")
    else: 
        print("Input sets of indices are unique. Will proceed checking other viable permutations.")

    num_indices = len(Thetas1)
    
    T = np.zeros((num_indices,3))
    U = np.zeros((num_indices,3))
    
    for i in range(num_indices):
        T[i,:] = [int(list(headers[i])[0]), int(list(headers[i])[1]), Thetas2[i]]
    
    print("input T")
    print(T)
    
    pair = (2,3)
    a0 = swap_indices(T[:,0], pair) # column 1  
    a1 = swap_indices(T[:,1], pair) # column 2 
    
    
    for k in range(num_indices):
        U[k, :] = [a0[k], a1[k], 0] 

    # below code due to @MadPhysicist from https://stackoverflow.com/questions/67223782/mapping-zs-in-numpy-array-a-x0-y0-z0-x1-y1-z1-for-3rd-column-of-ar/67235030?noredirect=1#67235030
    
    y_max = T[:, 1].max() + 1
    Tkey = key(T, y_max)
    s = np.argsort(Tkey)

    Ukey = key(U, y_max)
    i = np.searchsorted(Tkey, Ukey, sorter=s)
    i[i == len(i)] -= 1  # cleanup indices that won't match anyway
    mask = (Ukey == Tkey[s[i]])

    U2key = key(U[~mask, 1::-1], y_max)
    j = np.searchsorted(Tkey, U2key, sorter=s)
   
    U[mask, -1] = T[s[i[mask]], -1]
    U[~mask, -1] = -T[s[j], -1]
    
    print("reordered U")
    print(U)

The above gives the output:

input T
[[ 1.,  0.,  1.]
 [ 2.,  0.,  0.]
 [ 2.,  1., -1.]
 [ 3.,  0.,  1.]
 [ 3.,  1.,  0.]
 [ 3.,  2.,  1.]]
reordered U
[[ 1.,  0.,  1.]
 [ 3.,  0.,  0.]
 [ 3.,  1., -1.]
 [ 2.,  0.,  1.]
 [ 2.,  1.,  0.]
 [ 2.,  3.,  1.]]

Answer 1

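You can boil the algorithm down to three major steps:

1. Sort Txy
2. Binary search for Uxy in Txy
3. Binary search for the remaining Uyx in Txy

Combining the results is then trivial. The whole operation should be feasible in O(N log N) time, since that is how long each step takes.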

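Since np.searchsorted is the prime candidate for steps 2 and 3, assume that you can convert the first two columns into a unique key. For example, say that y <= y_max in all cases, and that y_max has a reasonable bound, such that x * y_max + y <= 2**32-1 for all x. You are free to use np.int64, or to use x_max instead of y_max.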
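
As a small illustration of why the multiplier matters, the sketch below builds keys for the example T with a multiplier that is too small and with one that is large enough. Taking the bound over both coordinate columns (the name m) is an assumption of this sketch rather than something stated in the answer; it keeps keys unique even when a pair is looked up in reversed order, as long as the coordinates are non-negative integers.

```python
import numpy as np

def key(arr, m):
    # Cast to int64 first so x * m has more headroom before overflowing.
    return arr[:, 0].astype(np.int64) * m + arr[:, 1]

T = np.array([[1, 0, 4], [0, 2, 3], [1, 2, 7]])

# Multiplier too small: (1, 0) and (0, 2) collapse onto the same key.
too_small = key(T, 2)

# Assumed safe choice: strictly larger than every coordinate in both columns.
m = T[:, :2].max() + 1
unique_keys = key(T, m)

print(too_small, unique_keys)
```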

So now you do:

def key(arr, m):
    return arr[:, 0] * m + arr[:, 1]

y_max = T[:, :2].max() + 1  # bound over both coordinate columns, so reversed pairs also get unique keys
Tkey = key(T, y_max)
s = np.argsort(Tkey)

To find which elements of U match:

Ukey = key(U, y_max)
i = np.searchsorted(Tkey, Ukey, sorter=s)
i[i == len(i)] -= 1  # cleanup indices that won't match anyway
mask = (Ukey == Tkey[s[i]])

Now find the reversed indices:

U2key = key(U[~mask, 1::-1], y_max)
j = np.searchsorted(Tkey, U2key, sorter=s)

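Since the mapping is bijective, this step only searches for elements that are guaranteed to exist, so the indices need no validation.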

Now you can combine the indices. If U doesn't already have a third column, add one:

U = np.concatenate((U, np.empty_like(T[:, :1])), axis=1)

Using the indices we computed, extract the elements of T that you want:

U[mask, -1] = T[s[i[mask]], -1]
U[~mask, -1] = -T[s[j], -1]

Now, if you can't get a mapping like key to work, things can get a little more complicated. If nothing else works, first try

def key(arr):
    return arr[:, 0] + 1j * arr[:, 1]

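The complex values will only be used as sort keys. If that fails, you may have to define a structured dtype and view your array through it for the search to work. You could of course implement a hierarchical search, but I feel that is out of scope here.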
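
A brief sketch of how the complex key behaves: NumPy sorts complex numbers lexicographically (real part first, then imaginary part), and np.searchsorted follows the same order, so x + 1j * y acts as an (x, y) sort key without any multiplier. The name ckey and the single-pair query are only for illustration.

```python
import numpy as np

def ckey(arr):
    # Real part carries x, imaginary part carries y.
    return arr[:, 0] + 1j * arr[:, 1]

T = np.array([[1, 0, 4], [0, 2, 3], [1, 2, 7]])

Tkey = ckey(T)
s = np.argsort(Tkey)  # row order of T sorted by (x, y)

# Look up the pair (1, 2) among the sorted keys.
query = ckey(np.array([[1, 2]]))
pos = np.searchsorted(Tkey, query, sorter=s)
row = s[pos[0]]  # index of the row of T holding (1, 2)
print(T[row])
```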

Here is a complete toy example based on your T, with U modified slightly to show both positive and negative values in the last column:

>>> T = np.array([[1, 0, 4],
                  [0, 2, 3],
                  [1, 2, 7]])
>>> U = np.array([[2, 1, 0],
                  [1, 0, 0],
                  [2, 0, 0]])
>>> def key(arr, m):
...     return arr[:, 0] * m + arr[:, 1]

>>> y_max = T[:, :2].max() + 1  # bound over both coordinate columns
>>> Tkey = key(T, y_max)
>>> s = np.argsort(Tkey)

>>> Ukey = key(U, y_max)
>>> i = np.searchsorted(Tkey, Ukey, sorter=s)
>>> i[i == len(i)] -= 1  # cleanup indices that won't match anyway
>>> mask = (Ukey == Tkey[s[i]])

>>> U2key = key(U[~mask, 1::-1], y_max)
>>> j = np.searchsorted(Tkey, U2key, sorter=s)

>>> U[mask, -1] = T[s[i[mask]], -1]
>>> U[~mask, -1] = -T[s[j], -1]
>>> print(U)
[[ 2  1 -7]
 [ 1  0  4]
 [ 2  0 -3]]
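
For reuse, the same steps can be wrapped in a single function. This is a sketch under stated assumptions, not the answer's exact code: the name match_values, the input U_xy (only the first two columns of U), and the explicit check for unmatched pairs are additions, and the coordinates are assumed to be non-negative integers small enough that x * m + y does not overflow int64.

```python
import numpy as np

def match_values(T, U_xy):
    """Return U = [x, y, value], copying T's value for equal pairs and negating it for reversed pairs."""
    T = np.asarray(T)
    U_xy = np.asarray(U_xy)
    # Multiplier strictly larger than every coordinate, so keys are unique.
    m = int(max(T[:, :2].max(), U_xy.max())) + 1

    def key(xy):
        return xy[:, 0].astype(np.int64) * m + xy[:, 1]

    Tkey = key(T)
    s = np.argsort(Tkey)          # step 1: sort T by key
    sorted_keys = Tkey[s]

    def lookup(keys):
        # Binary search, then clamp out-of-range positions and verify each hit.
        pos = np.searchsorted(sorted_keys, keys)
        pos[pos == len(sorted_keys)] = len(sorted_keys) - 1
        return s[pos], sorted_keys[pos] == keys

    rows, direct = lookup(key(U_xy))                 # step 2: (x, y) matches
    rev_rows, reverse = lookup(key(U_xy[:, ::-1]))   # step 3: (y, x) matches
    if not np.all(direct | reverse):
        raise ValueError("some (x, y) pairs of U have no partner in T")

    values = np.where(direct, T[rows, 2], -T[rev_rows, 2])
    return np.column_stack((U_xy, values))

T = np.array([[1, 0, 4], [0, 2, 3], [1, 2, 7]])
result = match_values(T, np.array([[2, 1], [1, 0], [2, 0]]))
print(result)
```

Unlike the step-by-step version, this variant searches the reversed keys for every row rather than only the unmatched ones, which costs a second full search but keeps the validation simple.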

Solution 1
1, accepted, 2021-04-23 18:16:43