How to elegantly drop unnecessary elements in numpy?

I have an ndarray of shape [batch_size, seq_len, num_features]. However, some elements at the end of the sequence dimension are not needed, so I want to drop them and merge the sequence dimension into the batch dimension. For example, the ndarray a I want to manipulate is:

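import numpy as np

batch_size = 2
seq_len = 3
num_features = 1
a = np.random.randn(batch_size, seq_len, num_features)
# Use the builtin bool; the np.bool alias is unavailable in some NumPy releases.
mask = np.ones((batch_size, seq_len), dtype=bool)
mask[0, 1:] = False  # keep only the first step of sequence 0
mask[1, 2:] = False  # keep only the first two steps of sequence 1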
"""
>>> a = [[[-0.3908401 ]
  [ 0.89686512]
  [ 0.07594243]]

 [[-0.12256737]
  [-1.00838131]
  [ 0.56543754]]]
mask=[[ True False False]
 [ True  True False]]
"""

where mask indicates which elements of a are useful. I can get what I want using the following code:

res = []
for seq, m in zip(a, mask):
    # m.sum() is the number of valid steps; this assumes they form a prefix
    res.append(seq[:m.sum()])
np.concatenate(res, axis=0)
"""
>>>array([[-0.3908401 ],
       [-0.12256737],
       [-1.00838131]])
"""

I'm wondering if there is a more elegant way to do this in numpy?

Not sure if this is what you are asking for, but the result looks right:

res = a[mask]
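As a rough check of the one-liner above, the sketch below compares it against the loop from the question. The arange data is an illustrative stand-in for the random values in the post, chosen so the result is easy to read. It relies on the fact that a 2D boolean mask applied to a 3D array indexes the first two axes and returns the selected rows in row-major order.

```python
import numpy as np

batch_size, seq_len, num_features = 2, 3, 1

# Illustrative data (assumption): arange instead of random values.
a = np.arange(batch_size * seq_len * num_features, dtype=float).reshape(
    batch_size, seq_len, num_features
)
mask = np.ones((batch_size, seq_len), dtype=bool)
mask[0, 1:] = False
mask[1, 2:] = False

# Loop version from the question: keep the first m.sum() steps of each sequence.
loop_res = np.concatenate([seq[: m.sum()] for seq, m in zip(a, mask)], axis=0)

# Boolean-mask version: the 2D mask selects along the first two axes of a.
masked_res = a[mask]

print(masked_res.shape)                      # (3, 1)
print(np.array_equal(loop_res, masked_res))  # True
```

The two agree here because the True entries in each row of mask form a prefix; with a non-prefix mask, a[mask] follows the mask while the loop would still take the leading elements.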

Since the batch and sequence dimensions are going to be merged, you could reshape a to a 2D array of shape (batch_size * seq_len, num_features) and flatten mask to a 1D array of length batch_size * seq_len.

Next, simply filter the rows you want to keep using boolean indexing. See the code:

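mask1d = mask.reshape(-1)       # 1D, length batch_size * seq_len (or mask.ravel())
a2d = a.reshape(-1, num_features)  # 2D, shape (batch_size * seq_len, num_features)
result = a2d[mask1d]            # keep only the rows where the mask is True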
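The reshape-then-filter approach can be sketched end to end as follows. The sizes and arange values are illustrative assumptions (num_features is set to 2 here, unlike in the question, to show that whole feature rows are kept), and the name flat_mask is just a label for the flattened mask.

```python
import numpy as np

# Illustrative sizes (assumption): two features per step instead of one.
batch_size, seq_len, num_features = 2, 3, 2
a = np.arange(batch_size * seq_len * num_features).reshape(
    batch_size, seq_len, num_features
)
mask = np.array([[True, False, False],
                 [True, True, False]])

flat_mask = mask.reshape(-1)        # shape (batch_size * seq_len,)
a2d = a.reshape(-1, num_features)   # shape (batch_size * seq_len, num_features)
result = a2d[flat_mask]             # rows where the mask is True

print(result)
# [[0 1]
#  [6 7]
#  [8 9]]
print(np.array_equal(result, a[mask]))  # True
```

Both forms give the same rows in this sketch; the explicit reshape mainly makes the merge of the batch and sequence dimensions visible.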
