Randomly selecting positive and negative data from array
I have written the following function:
from itertools import compress
from random import sample

def searchPositive(X, y, num):
    pos = sample(list(compress(X, y)), num)
    return pos
This function takes two NumPy arrays, X and y. The two arrays are related: y[i] is the label for X[i]. The label is either 1 or 0.
This function randomly picks num rows from X whose corresponding y value equals 1 and returns them as a (num, n) result, where n is the number of columns in X.
I need to get a list of the indices in X of the rows it selected. For example, if a row of pos is X[a], then a needs to be in that list. How can I do this?
I also need to do this when I am looking for negative examples. The function I currently use is:
import numpy as np

def searchNegative(X, y, num):
    mat = X[y == 0]
    rows = np.random.choice(len(mat), size=num, replace=False)
    mat = mat[rows, :]
    return mat
You want to use np.where to get the indices of your positive (or negative) y values. Then sample from the indices. Here's a function for positives; you can either modify it to let you select positive or negative, or write another function just for negatives. First, assume:
>>> y
array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
>>> X
array([[-25, 62, 94, 70, 96, 70, 38, -18, -57, 1],
[ 40, 86, -98, -48, 40, 29, 4, -83, 44, -12],
[ 57, 23, -96, 97, -24, -93, -33, -64, 61, 15],
[ 44, 29, 31, -38, 11, 85, 37, -96, -37, -70],
[-10, -37, -24, -66, 27, -44, -16, -50, 3, -91],
[-97, 81, 52, 41, 39, -14, 95, 76, 28, -32],
[-74, 49, -91, -65, -96, 86, -13, 43, 22, 80],
[ 5, 20, -77, 74, -89, 46, -90, 95, 30, 13],
[ 36, 6, 55, -74, -49, -66, 38, 37, -84, 28],
[-23, -28, -32, -30, -4, -52, -4, 99, -67, -98]])
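To make the first step concrete, here is a small sketch of what np.where gives back for the label vector shown above. The variable names pos_index and neg_index are chosen for illustration only.

```python
import numpy as np

# Same labels as in the example above.
y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])

# With a single condition, np.where returns a tuple holding one index
# array per dimension; [0] takes the row indices for this 1-D vector.
pos_index = np.where(y == 1)[0]
neg_index = np.where(y == 0)[0]

print(pos_index)  # [0 2 3 4 7 9]
print(neg_index)  # [1 5 6 8]
```

These index arrays are positions in the original X, so anything sampled from them can be used directly as X[rows, :].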
And so...
>>> def sample_positive(X, y, num):
...     pos_index = np.where(y == 1)[0]
...     rows = np.random.choice(pos_index, size=num, replace=False)
...     mat = X[rows, :]
...     return (mat, rows)
...
>>> X_sample, idx = sample_positive(X, y, 2)
>>> X_sample
array([[-23, -28, -32, -30, -4, -52, -4, 99, -67, -98],
[-10, -37, -24, -66, 27, -44, -16, -50, 3, -91]])
>>> idx
array([9, 4])
>>> X
array([[-25, 62, 94, 70, 96, 70, 38, -18, -57, 1],
[ 40, 86, -98, -48, 40, 29, 4, -83, 44, -12],
[ 57, 23, -96, 97, -24, -93, -33, -64, 61, 15],
[ 44, 29, 31, -38, 11, 85, 37, -96, -37, -70],
[-10, -37, -24, -66, 27, -44, -16, -50, 3, -91],
[-97, 81, 52, 41, 39, -14, 95, 76, 28, -32],
[-74, 49, -91, -65, -96, 86, -13, 43, 22, 80],
[ 5, 20, -77, 74, -89, 46, -90, 95, 30, 13],
[ 36, 6, 55, -74, -49, -66, 38, 37, -84, 28],
[-23, -28, -32, -30, -4, -52, -4, 99, -67, -98]])
>>> y
array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
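As suggested above, the same idea can be folded into one function that handles both positive and negative examples. The following is only a sketch under a few assumptions: the name sample_by_label and its label and rng parameters are hypothetical, the demo X is randomly generated rather than the matrix shown above, and it uses NumPy's Generator API (np.random.default_rng) instead of np.random.choice so the demo can be seeded.

```python
import numpy as np

def sample_by_label(X, y, num, label=1, rng=None):
    """Return (rows, indices): num random rows of X whose label equals `label`,
    plus the positions of those rows in the original X."""
    rng = np.random.default_rng() if rng is None else rng
    # Indices in the original arrays where the label matches.
    candidates = np.where(y == label)[0]
    if num > candidates.size:
        raise ValueError("not enough rows with the requested label")
    # Sample the indices themselves, without replacement.
    rows = rng.choice(candidates, size=num, replace=False)
    return X[rows, :], rows

# Demo with made-up data (illustrative only).
rng = np.random.default_rng(0)
X = rng.integers(-100, 100, size=(10, 4))
y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])

X_neg, idx_neg = sample_by_label(X, y, 2, label=0, rng=rng)
print(idx_neg)     # two distinct indices drawn from {1, 5, 6, 8}
print(y[idx_neg])  # every selected label is 0
```

Because the indices are sampled rather than the rows, the returned idx_neg always refers to positions in the original X, which is what the question asks for.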