Extending numpy mask
I want to mask a numpy array a with mask. The mask doesn't have exactly the same shape as a, but it is possible to mask a anyway (I guess because the additional dimension has length 1, so broadcasting applies?).
a.shape
>>> (3, 9, 31, 2, 1)
mask.shape
>>> (3, 9, 31, 2)
masked_a = ma.masked_array(a, mask)
The same logic, however, does not apply to array b, which has 5 elements in its last dimension.
ext_mask = mask[..., np.newaxis] # extending or not extending has same effect
ext_mask.shape
>>> (3, 9, 31, 2, 1)
b.shape
>>> (3, 9, 31, 2, 5)
masked_b = ma.masked_array(b, ext_mask)
>>> numpy.ma.core.MaskError: Mask and data not compatible: data size is 8370, mask size is 1674.
How can I create a (3, 9, 31, 2, 5) mask from the (3, 9, 31, 2) mask by expanding any True value in the last dimension of the (3, 9, 31, 2) mask to [True, True, True, True, True] (and any False value to [False, False, False, False, False])?
This gives the desired result:
masked_b = ma.masked_array(*np.broadcast_arrays(b, ext_mask))
I have not profiled this method, but it should be faster than allocating a new mask. According to the documentation, no data is copied:
These arrays are views on the original arrays. They are typically not contiguous. Furthermore, more than one element of a broadcasted array may refer to a single memory location. If you need to write to the arrays, make copies first.
It is possible to verify the no-copying behavior:
bb, mb = np.broadcast_arrays(b, ext_mask)
print(mb.shape) # (3, 9, 31, 2, 5) - same shape as b
print(mb.base.shape) # (3, 9, 31, 2) - the shape of the original mask
print(mb.strides) # (558, 62, 2, 1, 0) - that's how it works: 0 stride
It is pretty impressive how the numpy developers implemented broadcasting: values are repeated by using a stride of 0 along the last dimension. Wow!
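As an aside, on NumPy 1.10 and later the same zero-stride view can be requested directly with np.broadcast_to (a minimal sketch; the shapes mirror the question's):

```python
import numpy as np

# Assumed setup matching the question's shapes
mask = np.random.randn(3, 9, 31, 2) > 0
ext_mask = mask[..., np.newaxis]  # shape (3, 9, 31, 2, 1)

# broadcast_to returns a read-only, zero-copy view of the mask
big_mask = np.broadcast_to(ext_mask, (3, 9, 31, 2, 5))
print(big_mask.shape)        # (3, 9, 31, 2, 5)
print(big_mask.strides[-1])  # 0 -> values repeated via strides, not copied
```

Because the result is a view with a zero stride in the last axis, it uses no extra mask memory, but it is marked read-only; copy it first if you need to modify it.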
Edit
I compared the speed of broadcasting and allocating with this code:
import numpy as np
from numpy import ma
a = np.random.randn(30, 90, 31, 2, 1)
b = np.random.randn(30, 90, 31, 2, 5)
mask = np.random.randn(30, 90, 31, 2) > 0
ext_mask = mask[..., np.newaxis]
def broadcasting(a=a, b=b, ext_mask=ext_mask):
    mb1 = ma.masked_array(*np.broadcast_arrays(b, ext_mask))

def allocating(a=a, b=b, ext_mask=ext_mask):
    m2 = np.empty(b.shape, dtype=bool)
    m2[:] = ext_mask
    mb2 = ma.masked_array(b, m2)
Broadcasting is clearly faster than allocating here:
# array size: (30, 90, 31, 2, 5)
In [23]: %timeit broadcasting()
The slowest run took 10.39 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 39.4 µs per loop
In [24]: %timeit allocating()
The slowest run took 4.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 982 µs per loop
Note that I had to increase the array size for the difference in speed to become apparent. With the original array dimensions, allocating was slightly faster than broadcasting:
# array size: (3, 9, 31, 2, 5)
In [28]: %timeit broadcasting()
The slowest run took 9.36 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 39 µs per loop
In [29]: %timeit allocating()
The slowest run took 9.22 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 32.6 µs per loop
The broadcasting solution's runtime seems not to depend on the array size.