How to elegantly drop unnecessary elements in numpy?

Question

I have an ndarray of shape [batch_size, seq_len, num_features]. However, some elements at the end of the sequence dimension are not needed, so I want to drop them and merge the sequence dimension into the batch dimension. For example, the ndarray a I want to manipulate is

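import numpy as np

batch_size = 2
seq_len = 3
num_features = 1
a = np.random.randn(batch_size, seq_len, num_features)
# np.bool was removed in NumPy 1.24; the builtin bool is the equivalent dtype
mask = np.ones((batch_size, seq_len), dtype=bool)
mask[0, 1:] = False  # first sequence: only step 0 is valid
mask[1, 2:] = False  # second sequence: steps 0 and 1 are valid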
"""
>>> a = [[[-0.3908401 ]
  [ 0.89686512]
  [ 0.07594243]]

 [[-0.12256737]
  [-1.00838131]
  [ 0.56543754]]]
mask=[[ True False False]
 [ True  True False]]
"""

where mask indicates which elements of a are useful. I can get what I want using the following code

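res = []
for seq, m in zip(a, mask):
    # keep only the first m.sum() (valid) steps of each sequence
    res.append(seq[:m.sum()])
np.concatenate(res, axis=0)
# Note: the output below appears to come from a different random draw of a
# than the one printed above, so the values do not match it.
"""
>>>array([[0.08676509],
       [0.47162315],
       [0.98070665]])
"""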

I'm wondering if there is a more elegant way to do this in numpy?

Answer 1

Not sure if this is what you are asking for, but the result looks right:

res = a[mask]
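
As a quick check, here is a minimal sketch comparing this boolean indexing with the loop from the question. The seeded generator is my own choice for reproducibility and is not part of the original post. The two results agree because, in the question's mask, the True entries form a prefix of each row; for a mask with gaps, a[mask] keeps exactly the True positions, while the loop keeps the first sum(m) steps.

```python
import numpy as np

batch_size, seq_len, num_features = 2, 3, 1
rng = np.random.default_rng(0)  # seed is an assumption, used only for reproducibility
a = rng.standard_normal((batch_size, seq_len, num_features))

mask = np.ones((batch_size, seq_len), dtype=bool)
mask[0, 1:] = False
mask[1, 2:] = False

# Loop version from the question
looped = np.concatenate([seq[:m.sum()] for seq, m in zip(a, mask)], axis=0)

# A 2D boolean mask indexes the first two axes and merges them into one
res = a[mask]

print(res.shape)
print(np.array_equal(res, looped))
```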

Answer 2

Since the batch and sequence dimensions are going to be merged, you can reshape a to a 2D array of shape (batch_size * seq_len, num_features) and flatten mask to a 1D array of shape (batch_size * seq_len,).

Next, simply filter the valid samples using boolean indexing. See the code:

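mask_flat = mask.reshape(-1)       # 1D, shape (batch_size * seq_len,); mask.ravel() also works
a2d = a.reshape(-1, num_features)  # 2D, shape (batch_size * seq_len, num_features)
result = a2d[mask_flat]            # keep only the rows where the mask is True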
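
To see the reshape route on more than one feature, here is a small self-contained sketch. The `lengths` array and the `mask_flat` name are hypothetical and introduced only for illustration; the mask is built by comparing step indices against each sequence's valid length. It also checks that the result matches indexing the 3D array directly with the 2D mask.

```python
import numpy as np

batch_size, seq_len, num_features = 2, 3, 4
# Deterministic data so the selected rows are easy to recognise
a = np.arange(batch_size * seq_len * num_features, dtype=float)
a = a.reshape(batch_size, seq_len, num_features)

lengths = np.array([1, 2])  # hypothetical number of valid steps per sequence
mask = np.arange(seq_len) < lengths[:, None]  # shape (batch_size, seq_len)

mask_flat = mask.reshape(-1)       # shape (batch_size * seq_len,)
a2d = a.reshape(-1, num_features)  # shape (batch_size * seq_len, num_features)
result = a2d[mask_flat]

print(result.shape)
print(np.array_equal(result, a[mask]))
```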


Answer 1
1 Accepted 2019-11-24 02:23:57

Answer 2
0 2019-11-24 18:47:04
