How to debug numpy masks
This question is related to this one.
I have a function that I'm trying to vectorize. This is the original function:
def aspect_good(angle: float, planet1_good: bool, planet2_good: bool):
    """
    Decides if the angle represents a good aspect.
    NOTE: returns None if the angle doesn't represent an aspect.
    """
    if 112 <= angle <= 128 or 52 <= angle <= 68:
        return True
    elif 174 <= angle <= 186 or 84 <= angle <= 96:
        return False
    elif 0 <= angle <= 8 and planet1_good and planet2_good:
        return True
    elif 0 <= angle <= 6:
        return False
    else:
        return None
And this is what I have so far:
import numpy as np
import numpy.typing as npt

def aspects_good(
    angles: npt.ArrayLike,
    planets1_good: npt.ArrayLike,
    planets2_good: npt.ArrayLike,
) -> npt.NDArray:
    """
    Decides if the angles represent good aspects.
    Note: this function was contributed by Mad Physicist. Thank you.
    https://stackoverflow.com/q/73672739/11004423
    :returns: an array with values as follows:
        1 – the angle is a good aspect
        0 – the angle is a bad aspect
        -1 – the angle doesn't represent an aspect
    """
    result = np.full_like(angles, -1, dtype=np.int8)
    bad_mask = np.abs(angles % 90) <= 6
    result[bad_mask] = 0
    good_mask = (np.abs(angles - 120) <= 8) |\
                (np.abs(angles - 60) <= 8) |\
                ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)
    result[good_mask] = 1
    return result
It's not working as expected, however. I wrote a test with pytest:
def test_aspects_good():
    tests = np.array((
        (120, True, False, True),
        (60, True, False, True),
        (180, True, False, False),
        (90, True, False, False),
        (129, True, False, -1),
        (111, True, False, -1),
        (69, True, False, -1),
        (51, True, False, -1),
        (187, True, False, -1),
        (173, True, False, -1),
        (97, True, False, -1),
        (83, True, False, -1),
        (0, True, True, True),
        (0, True, False, False),
        (0, False, True, False),
        (0, False, False, False),
        (7, False, False, -1),
        (7, True, True, True),
        (9, True, True, -1),
    ))
    angles = tests[:, 0]
    planets1_good = tests[:, 1]
    planets2_good = tests[:, 2]
    expected = tests[:, 3]
    result = aspects_good(angles, planets1_good, planets2_good)
    assert np.array_equal(result, expected)
It fails, saying False: the arrays are different.
Here are the result and expected arrays combined side by side:
array([[ 1,  1],
       [ 1,  1],
       [ 0,  0],
       [ 0,  0],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [ 0,  1],
       [ 0,  0],
       [ 0,  0],
       [ 0,  0],
       [-1, -1],
       [-1,  1],
       [-1, -1]])
Note: the first column is the result array, and the second one is expected. As you can see, they differ in two places. Now the question is: how do I debug this? Normally I would use a debugger and step through each if/elif/else condition, but I have no idea how to debug numpy masks.
The issue appears to be a combination of three things:
Numpy uses a homogeneous type throughout an array.
You will find that tests.dtype is dtype('int64') or dtype('int32'), depending on your architecture. This means that the columns containing planet1_good and planet2_good are integers too, not booleans.
Bitwise AND (&) is not a logical operator.
A bitwise AND operation returns a result with the largest of the input types. Specifically, for the result of <=, which is a boolean array, and an int array, the result will be an int array. That means that you can do something like np.array([1, 2]) & np.array([True, True]) to get array([1, 0]), not array([True, False]).
Numpy distinguishes between a boolean mask and a fancy index by the dtype, even if the fancy index contains only zeros and ones.
If you have a 2-element array, x, then x[[True, True]] = 1 assigns 1 to both elements of x. However, x[[1, 1]] = 1 assigns 1 only to the second element of x.
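The three points above can be checked in isolation. The following is a minimal sketch, not part of the original answer; the small arrays (tests, flags, mixed, x, y) are made up for illustration.

```python
import numpy as np

# 1. Mixing ints and bools in one array gives a single integer dtype:
#    the boolean columns are stored as 0 and 1.
tests = np.array(((120, True, False), (0, True, True)))
print(tests.dtype)

# 2. A boolean array ANDed with an integer array gives an integer array.
flags = tests[:, 1]                 # integer column holding 1s
mixed = (tests[:, 0] <= 8) & flags  # bool & int
print(mixed.dtype, mixed)

# 3. The same zeros and ones behave differently depending on dtype.
x = np.zeros(2, dtype=int)
x[np.array([True, True])] = 1       # boolean mask: both elements selected
y = np.zeros(2, dtype=int)
y[np.array([1, 1])] = 1             # fancy index: position 1, twice
print(x, y)
```

Printing the dtype of each intermediate array is usually the quickest way to tell a mask from an index.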
So that's basically what's happening here. bad_mask is a boolean mask, and works exactly as you would expect. However, good_mask is ANDed with a couple of integer arrays, so it becomes an integer array containing zeros and ones. The expression result[good_mask] = 1 actually assigns 1 to the first and second elements of result, which fortuitously correspond to two of your tests. The remaining True results cannot and will not be assigned 1.
There are a few ways to fix this, listed in decreasing order of preference (my favorite on top):
Convert all your arrays to numpy arrays of the correct type. Right now your function does not meet its contract of accepting any array-like. If you pass in a list for angles, you will get TypeError: unsupported operand type(s) for %: 'list' and 'int'. This is a fairly idiomatic approach:
angles = np.asanyarray(angles)
planets1_good = np.asanyarray(planets1_good, dtype=bool)
planets2_good = np.asanyarray(planets2_good, dtype=bool)
result = np.full_like(angles, -1, dtype=np.int8)
bad_mask = np.abs(angles % 90) <= 6
result[bad_mask] = 0
good_mask = (np.abs(angles - 120) <= 8) |\
            (np.abs(angles - 60) <= 8) |\
            ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)
result[good_mask] = 1
return result
Ensure that good_mask is actually a mask before applying it. You should still convert angles, but the other arrays will be converted automatically by the & operator:
good_mask = ((np.abs(angles - 120) <= 8) |\
             (np.abs(angles - 60) <= 8) |\
             ((np.abs(angles - 4) <= 4) & planets1_good & planets2_good)).astype(bool)
You may alternatively do something similar to what you did with bad_mask:
good_mask = (np.abs(angles % 60) <= 8) & (angles >= -8) & (angles <= 128)
Convert the mask to an index, which won't care about the original dtype:
result[np.flatnonzero(good_mask)] = 1
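To see how the fixes differ from the failing behavior, here is a small side-by-side sketch. It is an illustration only: int_mask is a made-up stand-in for the integer good_mask from the question.

```python
import numpy as np

# An integer array of zeros and ones, standing in for the faulty good_mask.
int_mask = np.array([1, 0, 1, 1])

a = np.full(4, -1, dtype=np.int8)
a[int_mask] = 1                    # fancy indexing: only positions 0 and 1

b = np.full(4, -1, dtype=np.int8)
b[int_mask.astype(bool)] = 1       # converted to a real boolean mask

c = np.full(4, -1, dtype=np.int8)
c[np.flatnonzero(int_mask)] = 1    # positions of the nonzero entries

print(a, b, c)
```

The astype(bool) and np.flatnonzero variants select the same elements, while direct indexing with the integer array does not.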