numpy mask array limiting the frequency of masked values
Starting from an array:
import numpy as np
a = np.array([1,1,1,2,3,4,5,5])
and a filter:
m = np.array([1,5])
I am now building a mask with:
b = np.in1d(a,m)
that correctly returns:
array([ True, True, True, False, False, False, True, True], dtype=bool)
I need to limit the number of True values per unique value to a maximum of 2, so that 1 is masked only two times instead of three. The resulting mask would then look like one of the following (it does not matter which of the original True values are kept):
array([ True, True, False, False, False, False, True, True], dtype=bool)
or
array([ True, False, True, False, False, False, True, True], dtype=bool)
or
array([ False, True, True, False, False, False, True, True], dtype=bool)
Ideally this is a kind of "random" masking with a limit on how often each value is selected. So far I have tried randomly selecting the original unique elements in the array, but the mask still selects all matching True values regardless of their frequency.
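To make the requirement concrete, it can be written as a check. The sketch below is only an illustration: `is_valid_mask` and its `max_count` parameter are hypothetical names, not part of NumPy, and `np.isin` is used because newer NumPy releases recommend it over `np.in1d`. It rejects the plain membership mask and accepts any of the three masks listed above.

```python
import numpy as np

def is_valid_mask(a, m, mask, max_count=2):
    """Hypothetical helper: True if `mask` selects only values from `m`
    and exactly min(occurrences, max_count) entries for each of them."""
    a = np.asarray(a)
    mask = np.asarray(mask, dtype=bool)
    # Nothing outside the filter may be selected
    if not np.isin(a[mask], m).all():
        return False
    for v in np.unique(m):
        available = np.count_nonzero(a == v)      # occurrences of v in a
        selected = np.count_nonzero(a[mask] == v)  # occurrences kept by the mask
        if selected != min(available, max_count):
            return False
    return True

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
full = np.isin(a, m)  # plain membership mask: selects all three 1s
ok = np.array([True, False, True, False, False, False, True, True])

print(is_valid_mask(a, m, full))  # False
print(is_valid_mask(a, m, ok))    # True
```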
For a generic case with an unsorted input array, here's one approach based on np.searchsorted -
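N = 2 # Parameter to decide how many duplicates are allowed
sortidx = a.argsort() # indices that sort a
# Positions (in sorted order) of the first N candidate occurrences of each value in m
idx = np.searchsorted(a,m,sorter=sortidx)[:,None] + np.arange(N)
# Actual number of occurrences of each value in m, capped at N
lim_counts = (a[:,None] == m).sum(0).clip(max=N)
# Keep only the candidate positions that really hold that value
idx_clipped = idx[lim_counts[:,None] > np.arange(N)]
# Build the mask in sorted order, then map it back to the original order
out = np.in1d(np.arange(a.size),idx_clipped)[sortidx.argsort()]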
Sample run -
In [37]: a
Out[37]: array([5, 1, 4, 2, 1, 3, 5, 1])
In [38]: m
Out[38]: [1, 2, 5]
In [39]: N
Out[39]: 2
In [40]: out
Out[40]: array([ True, True, False, True, True, False, True, False], dtype=bool)
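The approach above always keeps the occurrences that come first in sorted order. If the kept occurrences should instead be picked at random, as the question suggests, one possible variation is a plain loop over the filter values. This is a minimal sketch and not the searchsorted method itself: `random_limited_mask`, `max_count`, and `rng` are hypothetical names introduced here, and the loop assumes `m` is small enough that vectorization is not a concern.

```python
import numpy as np

def random_limited_mask(a, m, max_count=2, rng=None):
    """Hypothetical helper: mask entries of `a` found in `m`, keeping at most
    `max_count` randomly chosen occurrences of each value."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    out = np.zeros(a.size, dtype=bool)
    for v in np.unique(m):
        positions = np.flatnonzero(a == v)  # where v occurs in a
        if positions.size > max_count:
            # Draw max_count distinct positions without replacement
            positions = rng.choice(positions, size=max_count, replace=False)
        out[positions] = True
    return out

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
mask = random_limited_mask(a, m, max_count=2, rng=np.random.default_rng(0))
print(mask.sum())  # 4: two of the three 1s plus both 5s
```

Which two of the three 1s end up selected depends on the random generator, so only the per-value counts are guaranteed.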