How to elegantly drop unnecessary elements in numpy?
I have an ndarray of shape [batch_size, seq_len, num_features]. However, some of the elements at the end of the sequence dimension are not needed, so I want to drop them and merge the sequence dimension into the batch dimension. For example, the ndarray a I want to manipulate is:
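import numpy as np

batch_size = 2
seq_len = 3
num_features = 1
a = np.random.randn(batch_size, seq_len, num_features)
# Use the built-in bool: the np.bool alias was removed in NumPy 1.24.
mask = np.ones((batch_size, seq_len), dtype=bool)
mask[0, 1:] = False  # keep only the first step of sequence 0
mask[1, 2:] = False  # keep only the first two steps of sequence 1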
"""
>>> a = [[[-0.3908401 ]
          [ 0.89686512]
          [ 0.07594243]]
         [[-0.12256737]
          [-1.00838131]
          [ 0.56543754]]]
mask = [[ True False False]
        [ True  True False]]
"""
where mask indicates whether each element of a is useful. I can get what I want using the following code:
res = []
for seq, m in zip(a, mask):
res.append(seq[:sum(m)])
np.concatenate(res, axis=0)
"""
>>> array([[-0.3908401 ],
       [-0.12256737],
       [-1.00838131]])
"""
I'm wondering if there is a more elegant way to do this in numpy?
Not sure if this is what you are asking for, but the result looks right:
res = a[mask]
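As a quick check, the sketch below compares boolean indexing against the loop from the question. It reuses the shapes given above; the fixed seed and the helper name `drop_with_loop` are assumptions introduced here for illustration only. The two results agree because every row of mask is True for a leading prefix and False afterwards: a[mask] selects exactly the positions marked True, whereas seq[:sum(m)] always takes the first sum(m) positions.

```python
import numpy as np

def drop_with_loop(a, mask):
    # Hypothetical helper wrapping the loop from the question:
    # keep the first m.sum() steps of each sequence, then stack them.
    return np.concatenate([seq[:m.sum()] for seq, m in zip(a, mask)], axis=0)

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
batch_size, seq_len, num_features = 2, 3, 1
a = rng.standard_normal((batch_size, seq_len, num_features))

mask = np.ones((batch_size, seq_len), dtype=bool)
mask[0, 1:] = False
mask[1, 2:] = False

# A 2D boolean mask indexes the first two axes and merges them into one.
res = a[mask]
loop_res = drop_with_loop(a, mask)

print(res.shape)                      # (3, 1)
print(np.array_equal(res, loop_res))  # True
```

If the mask could contain True values after a False within a row, the two approaches would select different elements, so it is worth deciding which behaviour is actually wanted.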
Since the batch and sequence dimensions are going to be merged, you could reshape a to a 2D array of shape (batch_size * seq_len, num_features) and flatten mask to a 1D array of shape (batch_size * seq_len,).
Next, simply filter the useful samples using a boolean index. See the code:
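mask1d = mask.reshape(-1)  # or mask.ravel(); shape (batch_size * seq_len,)
a2d = a.reshape(-1, num_features)  # shape (batch_size * seq_len, num_features)
result = a2d[mask1d]  # keep only the rows where the mask is True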
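To see that the reshape route returns the same rows as indexing the 3D array with the 2D mask directly, here is a small sketch. The concrete sizes, the variable name `flat_mask`, and the use of np.arange for the data are assumptions chosen so the result is easy to read by eye.

```python
import numpy as np

batch_size, seq_len, num_features = 2, 3, 2

# Deterministic data (an assumption for readability): values 0..11.
a = np.arange(batch_size * seq_len * num_features, dtype=float)
a = a.reshape(batch_size, seq_len, num_features)

# Keep the first step of sequence 0 and the first two steps of sequence 1.
mask = np.zeros((batch_size, seq_len), dtype=bool)
mask[0, :1] = True
mask[1, :2] = True

flat_mask = mask.reshape(-1)       # shape (6,)
a2d = a.reshape(-1, num_features)  # shape (6, 2)
result = a2d[flat_mask]            # rows where the flattened mask is True

print(result)
# [[0. 1.]
#  [6. 7.]
#  [8. 9.]]
print(np.array_equal(result, a[mask]))  # True
```

Both forms select the same elements in the same row-major order, so the choice between them is mostly a matter of readability.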