NumPy masked array: limiting the frequency of masked values

Question

Starting from an array:

a = np.array([1,1,1,2,3,4,5,5])

and a filter:

m = np.array([1,5])

I am now building a mask with:

b = np.in1d(a,m)

that correctly returns:

array([ True,  True,  True, False, False, False,  True,  True], dtype=bool)

I need to limit the number of True values for each unique value to a maximum of 2, so that 1 is masked only twice instead of three times. The resulting mask would then be one of the following (it does not matter which of the original True values are kept):

array([ True,  True,  False, False, False, False,  True,  True], dtype=bool)

or

array([ True,  False,  True, False, False, False,  True,  True], dtype=bool)

or

array([ False,  True,  True, False, False, False,  True,  True], dtype=bool)

Ideally this is a kind of "random" masking that caps how often each value is selected. So far I have tried randomly selecting the original unique elements of the array, but the mask still selects every matching value regardless of its frequency.
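The requirement can be written down as a small check, which makes it easy to compare candidate masks. This is only a minimal sketch and not part of the original question: the helper name `is_valid_limited_mask` and the parameter `n_max` are hypothetical, it uses `np.isin` in place of `np.in1d`, and it assumes "limit" means each value in `m` ends up selected exactly `min(count, n_max)` times.

```python
import numpy as np

def is_valid_limited_mask(mask, a, m, n_max=2):
    """Hypothetical helper: does `mask` select each value of `m` in `a`
    exactly min(count, n_max) times, and nothing else?"""
    a = np.asarray(a)
    mask = np.asarray(mask, dtype=bool)
    # True may only appear where `a` holds one of the values in `m`.
    if np.any(mask & ~np.isin(a, m)):
        return False
    for value in np.unique(m):
        hits = (a == value)
        expected = min(np.count_nonzero(hits), n_max)
        if np.count_nonzero(mask & hits) != expected:
            return False
    return True

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
full = np.isin(a, m)  # unrestricted mask: 1 is selected three times
ok = np.array([True, False, True, False, False, False, True, True])

print(is_valid_limited_mask(full, a, m))  # False
print(is_valid_limited_mask(ok, a, m))    # True
```

Any of the three masks listed in the question passes this check, while the plain membership mask does not.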

Answer 1

For a generic case with an unsorted input array, here's one approach based on np.searchsorted -

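import numpy as np

N = 2 # Parameter to decide how many duplicates are allowed

# Sort order of a; searchsorted then finds where each value of m starts
sortidx = a.argsort()
idx = np.searchsorted(a,m,sorter=sortidx)[:,None] + np.arange(N)
# Occurrences of each value of m in a, capped at N
lim_counts = (a[:,None] == m).sum(0).clip(max=N)
# Keep only as many sorted positions per value as its capped count
idx_clipped = idx[lim_counts[:,None] > np.arange(N)]
# Mark those sorted positions, then map back to the original order
out = np.in1d(np.arange(a.size),idx_clipped)[sortidx.argsort()]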

Sample run -

In [37]: a
Out[37]: array([5, 1, 4, 2, 1, 3, 5, 1])

In [38]: m
Out[38]: [1, 2, 5]

In [39]: N
Out[39]: 2

In [40]: out
Out[40]: array([ True, True, False, True, True, False, True, False], dtype=bool)
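
The question also asks for the kept occurrences to be picked at random, whereas the approach above keeps the first N positions of each value in sort order. One possible way to get the random behaviour is sketched below, assuming a plain Python loop over the values of `m` is fast enough. The function name `random_limited_mask` and the parameters `n_max` and `rng` are hypothetical and not from the answer, and `np.random.default_rng` needs NumPy 1.17 or later.

```python
import numpy as np

def random_limited_mask(a, m, n_max=2, rng=None):
    """Hypothetical helper: mask entries of `a` found in `m`, keeping at
    most `n_max` randomly chosen occurrences of each value."""
    rng = np.random.default_rng(rng)  # accepts None, a seed, or a Generator
    a = np.asarray(a)
    out = np.zeros(a.size, dtype=bool)
    for value in np.unique(m):
        positions = np.flatnonzero(a == value)
        if positions.size > n_max:
            # Too many occurrences: keep a random subset of size n_max.
            positions = rng.choice(positions, size=n_max, replace=False)
        out[positions] = True
    return out

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
mask = random_limited_mask(a, m, n_max=2, rng=0)
print(mask)  # two of the three 1s are True, both 5s are True
```

Which two of the three 1s are kept depends on the seed, so the result is any one of the three masks listed in the question.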

Answer 1
2 · Accepted · 2016-04-06 12:56:30
