Randomly selecting positive and negative data from array

I have written the following function:

from itertools import compress
from random import sample

def searchPosotive(X, y, num):
    pos = sample(list(compress(X, y)), num)
    return pos

This function takes in two NumPy arrays, X and y. The two arrays are related, i.e. y[i] is the label for X[i]. The label is either a 1 or a 0.

This function randomly picks num rows from X whose corresponding y value is equal to 1 and returns a (num, n) array, where n is the number of columns in X.

I also need a list of the indices in X of the rows it selected. For example, if a returned row is X[a], then a would need to be in that list. How can I do this?

I also need to do this when I am looking for negative examples. The current function I use is:

import numpy as np

def searchNegative(X, y, num):
    mat = X[y == 0]  # rows whose label is 0
    rows = np.random.choice(len(mat), size=num, replace=False)
    mat = mat[rows, :]
    return mat

You want to use np.where to get the indices of your positive (or negative) labels in y. Then, sample from those indices. Here's a function for the positive case; you can either modify it to let you select positive or negative, or write another function just for negative. First, assume:

>>> y
array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
>>> X
array([[-25,  62,  94,  70,  96,  70,  38, -18, -57,   1],
       [ 40,  86, -98, -48,  40,  29,   4, -83,  44, -12],
       [ 57,  23, -96,  97, -24, -93, -33, -64,  61,  15],
       [ 44,  29,  31, -38,  11,  85,  37, -96, -37, -70],
       [-10, -37, -24, -66,  27, -44, -16, -50,   3, -91],
       [-97,  81,  52,  41,  39, -14,  95,  76,  28, -32],
       [-74,  49, -91, -65, -96,  86, -13,  43,  22,  80],
       [  5,  20, -77,  74, -89,  46, -90,  95,  30,  13],
       [ 36,   6,  55, -74, -49, -66,  38,  37, -84,  28],
       [-23, -28, -32, -30,  -4, -52,  -4,  99, -67, -98]])

And so...

>>> def sample_positive(X, y, num):
...     pos_index = np.where(y == 1)[0]
...     rows = np.random.choice(pos_index, size=num, replace=False)
...     mat = X[rows,:]
...     return (mat, rows)
...
>>> X_sample, idx = sample_positive(X, y, 2)
>>> X_sample
array([[-23, -28, -32, -30,  -4, -52,  -4,  99, -67, -98],
       [-10, -37, -24, -66,  27, -44, -16, -50,   3, -91]])
>>> idx
array([9, 4])
>>> X
array([[-25,  62,  94,  70,  96,  70,  38, -18, -57,   1],
       [ 40,  86, -98, -48,  40,  29,   4, -83,  44, -12],
       [ 57,  23, -96,  97, -24, -93, -33, -64,  61,  15],
       [ 44,  29,  31, -38,  11,  85,  37, -96, -37, -70],
       [-10, -37, -24, -66,  27, -44, -16, -50,   3, -91],
       [-97,  81,  52,  41,  39, -14,  95,  76,  28, -32],
       [-74,  49, -91, -65, -96,  86, -13,  43,  22,  80],
       [  5,  20, -77,  74, -89,  46, -90,  95,  30,  13],
       [ 36,   6,  55, -74, -49, -66,  38,  37, -84,  28],
       [-23, -28, -32, -30,  -4, -52,  -4,  99, -67, -98]])
>>> y
array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
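
The suggestion above to modify the function so it handles either class could be sketched roughly as follows. This is only one possible way to do it, not part of the original answer: the name sample_by_label and the label and rng parameters are made up for illustration, and X here is stand-in data rather than the matrix shown above.

```python
import numpy as np

def sample_by_label(X, y, num, label=1, rng=None):
    """Pick `num` random rows of X whose entry in y equals `label`.

    Returns (rows, indices), where indices are positions in the original X.
    """
    # Hypothetical helper: use the caller's generator if given, else a fresh one
    rng = np.random.default_rng() if rng is None else rng
    # Positions in y (and therefore rows of X) that carry the requested label
    candidates = np.where(np.asarray(y) == label)[0]
    if num > candidates.size:
        raise ValueError("num exceeds the number of rows with that label")
    # Sample the indices themselves, without replacement
    idx = rng.choice(candidates, size=num, replace=False)
    return X[idx, :], idx

y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
X = np.arange(100).reshape(10, 10)  # stand-in data, one row per label

X_pos, pos_idx = sample_by_label(X, y, 2, label=1)
X_neg, neg_idx = sample_by_label(X, y, 3, label=0)
```

Because the indices are sampled first and the rows are looked up afterwards, X[pos_idx] always matches X_pos, which is exactly the index list the question asks for.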
