How to debug numpy masks

Question

This question is related to this one.

I have a function that I'm trying to vectorize. This is the original function:

def aspect_good(angle: float, planet1_good: bool, planet2_good: bool):
    """
    Decides if the angle represents a good aspect.
    NOTE: returns None if the angle doesn't represent an aspect.
    """

    if 112 <= angle <= 128 or 52 <= angle <= 68:
        return True
    elif 174 <= angle <= 186 or 84 <= angle <= 96:
        return False
    elif 0 <= angle <= 8 and planet1_good and planet2_good:
        return True
    elif 0 <= angle <= 6:
        return False
    else:
        return None

And this is what I have so far:

import numpy as np
import numpy.typing as npt

def aspects_good(
    angles: npt.ArrayLike,
    planets1_good: npt.ArrayLike,
    planets2_good: npt.ArrayLike,
) -> npt.NDArray:
    """
    Decides if the angles represent good aspects.

    Note: this function was contributed by Mad Physicist. Thank you.
    https://stackoverflow.com/q/73672739/11004423

    :returns: an array with values as follows:
        1 – the angle is a good aspect
        0 – the angle is a bad aspect
       -1 – the angle doesn't represent an aspect
    """
    result = np.full_like(angles, -1, dtype=np.int8)

    bad_mask = np.abs(angles % 90) <= 6
    result[bad_mask] = 0

    good_mask = (np.abs(angles - 120) <= 8) |\
                (np.abs(angles - 60) <= 8) |\
                ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)
    result[good_mask] = 1
    return result

However, it's not working as expected. I wrote a test with pytest:

def test_aspects_good():
    tests = np.array((
        (120, True, False, True),
        (60, True, False, True),
        (180, True, False, False),
        (90, True, False, False),

        (129, True, False, -1),
        (111, True, False, -1),
        (69, True, False, -1),
        (51, True, False, -1),
        (187, True, False, -1),
        (173, True, False, -1),
        (97, True, False, -1),
        (83, True, False, -1),

        (0, True, True, True),
        (0, True, False, False),
        (0, False, True, False),
        (0, False, False, False),

        (7, False, False, -1),
        (7, True, True, True),
        (9, True, True, -1),
    ))

    angles = tests[:, 0]
    planets1_good = tests[:, 1]
    planets2_good = tests[:, 2]
    expected = tests[:, 3]

    result = aspects_good(angles, planets1_good, planets2_good)
    assert np.array_equal(result, expected)

It fails, saying False: the arrays are different.

Here are the result and expected arrays combined side by side:

array([[ 1,  1],
       [ 1,  1],
       [ 0,  0],
       [ 0,  0],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [ 0,  1],
       [ 0,  0],
       [ 0,  0],
       [ 0,  0],
       [-1, -1],
       [-1,  1],
       [-1, -1]])

Note: the first column is the result array, and the second one is expected. As you can see, they differ in two places. Now the question is "How do I debug this?" Normally I would use a debugger and step through each if/elif/else condition. But I have no idea how to debug numpy masks.

Answer 1

The issue appears to be a combination of three things:

1. Numpy uses a homogeneous type throughout an array.
   You will find that tests.dtype is dtype('int64') or dtype('int32'), depending on your architecture. This means that the columns containing planets1_good and planets2_good are integers too, not booleans.
2. Bitwise AND ( & ) is not a logical operator.
   A bitwise AND operation returns a result with the largest of the input types. Specifically, when the result of <= , which is a boolean array, is ANDed with an int array, the result is an int array. That means that np.array([1, 2]) & np.array([True, True]) gives array([1, 0]), not array([True, False]).
3. Numpy distinguishes between a boolean mask and a fancy index by the dtype, even if the fancy index contains only zeros and ones.
   If you have a 2 element array, x, then x[[True, True]] = 1 assigns 1 to both elements of x. However, x[[1, 1]] = 1 assigns 1 only to the second element of x.
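
The three points above can each be checked in isolation. Here is a minimal sketch using small made-up arrays (table, anded, x and y are illustrative names, not part of the question's code):

```python
import numpy as np

# 1. Mixing ints and bools in one array gives a single integer dtype.
table = np.array([(120, True, False), (0, True, True)])
print(table.dtype.kind)          # i (the bool columns were promoted to ints)

# 2. Bitwise AND of an int array with a bool array gives an int array.
anded = np.array([1, 2]) & np.array([True, True])
print(anded, anded.dtype.kind)   # [1 0] i

# 3. Indexing depends on dtype: bool is a mask, int is a fancy index.
x = np.zeros(2, dtype=int)
x[np.array([True, True])] = 1    # mask: both elements are assigned
y = np.zeros(2, dtype=int)
y[np.array([1, 1])] = 1          # fancy index: only element 1 is assigned
print(x, y)                      # [1 1] [0 1]
```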

So that's basically what's happening here. bad_mask is a boolean mask, and works exactly as you would expect. However, good_mask is ANDed with a couple of integer arrays, so it becomes an integer array containing zeros and ones. The expression result[good_mask] = 1 is actually assigning 1 to the first and second elements of result, which fortuitously correspond to two of your tests. The remaining True results cannot and will not be assigned 1.
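
As for the debugging question itself, one possible approach is to compute each sub-expression separately and look at its dtype next to its values, since the dtype decides whether indexing treats the array as a mask. The following is only a rough sketch: describe is a hypothetical helper (not a numpy function), and the three rows are a small sample in the same layout as the test table.

```python
import numpy as np

def describe(name, arr):
    """Hypothetical helper: report whether arr would index as a boolean mask."""
    arr = np.asanyarray(arr)
    is_mask = arr.dtype == bool
    kind = "boolean mask" if is_mask else "NOT a mask (fancy index)"
    print(f"{name}: dtype={arr.dtype}, {kind}, values={arr.tolist()}")
    return is_mask

# Rows of (angle, planet1_good, planet2_good); the bools become integers here.
tests = np.array([(120, True, False), (0, True, True), (7, True, True)])
angles, p1, p2 = tests[:, 0], tests[:, 1], tests[:, 2]

near_120 = np.abs(angles - 120) <= 8
near_0_both_good = (np.abs(angles - 4) <= 4) & p1 & p2
good_mask = near_120 | near_0_both_good

near_120_ok = describe("near_120", near_120)                # comparison: bool
near_0_ok = describe("near_0_both_good", near_0_both_good)  # AND with ints: int
good_ok = describe("good_mask", good_mask)                  # OR with ints: int
```

The first intermediate whose dtype is not bool points at the operand that needs converting.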

There are a few ways to fix this, listed in decreasing order of preference (my favorite on top):

Convert all your arrays to numpy arrays of the correct type. Right now your function does not meet the contract that it accepts any array-like. If you pass in a list for angles, you will get TypeError: unsupported operand type(s) for %: 'list' and 'int'. This is a fairly idiomatic approach:
```
angles = np.asanyarray(angles)
planets1_good = np.asanyarray(planets1_good, dtype=bool)
planets2_good = np.asanyarray(planets2_good, dtype=bool)

result = np.full_like(angles, -1, dtype=np.int8)

bad_mask = np.abs(angles % 90) <= 6
result[bad_mask] = 0

good_mask = (np.abs(angles - 120) <= 8) |\
            (np.abs(angles - 60) <= 8) |\
            ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)
result[good_mask] = 1

return result
```
Ensure that good_mask is actually a mask before applying it. You should still convert angles, but the other arrays will be converted automatically by the & operator:
```
good_mask = ((np.abs(angles - 120) <= 8) |\
             (np.abs(angles - 60) <= 8) |\
             ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)).astype(bool)
```
You may alternatively do something similar to what you did with bad_mask:
```
good_mask = (np.abs(angles % 60) <= 8) & (angles >= -8) & (angles <= 128)
```
Convert the mask to an index, which won't care about the original dtype:
```
result[np.flatnonzero(good_mask)] = 1
```
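
To see the first option working end to end, the sketch below wraps that body in a complete function and runs it against the rows of the question's test table. It assumes the columns are passed as separate sequences so the flags stay boolean, and it writes the expected values as 1 and 0 rather than True and False; rows, p1 and p2 are illustrative names.

```python
import numpy as np
import numpy.typing as npt

def aspects_good(
    angles: npt.ArrayLike,
    planets1_good: npt.ArrayLike,
    planets2_good: npt.ArrayLike,
) -> npt.NDArray:
    # Coerce the inputs so plain sequences work and the flags are real masks.
    angles = np.asanyarray(angles)
    planets1_good = np.asanyarray(planets1_good, dtype=bool)
    planets2_good = np.asanyarray(planets2_good, dtype=bool)

    result = np.full_like(angles, -1, dtype=np.int8)
    bad_mask = np.abs(angles % 90) <= 6
    result[bad_mask] = 0
    good_mask = (
        (np.abs(angles - 120) <= 8)
        | (np.abs(angles - 60) <= 8)
        | ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)
    )
    result[good_mask] = 1
    return result

# (angle, planet1_good, planet2_good, expected) rows from the question's test.
rows = [
    (120, True, False, 1), (60, True, False, 1),
    (180, True, False, 0), (90, True, False, 0),
    (129, True, False, -1), (111, True, False, -1),
    (69, True, False, -1), (51, True, False, -1),
    (187, True, False, -1), (173, True, False, -1),
    (97, True, False, -1), (83, True, False, -1),
    (0, True, True, 1), (0, True, False, 0),
    (0, False, True, 0), (0, False, False, 0),
    (7, False, False, -1), (7, True, True, 1), (9, True, True, -1),
]
angles, p1, p2, expected = zip(*rows)
result = aspects_good(angles, p1, p2)
print(np.array_equal(result, expected))  # True
```

Because good_mask is now built only from boolean arrays, result[good_mask] = 1 is applied as a mask and the two previously failing rows come out as 1.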
