From a 2D array, create a second 2D array of unique (non-repeated) randomly selected values from the first array (values not shared among rows) without using a loop

This is a follow-up to this question:

From a 2D array, create another 2D array composed of randomly selected values from the original array (values not shared among rows) without using a loop

I am looking for a way to create a 2D array whose rows are randomly selected unique (non-repeating) values from the corresponding row of another array, without using a loop.

Here is a way to do it using a loop.

import numpy as np

pool = np.random.randint(0, 30, size=[4, 5])
seln = np.empty([4, 3], int)

# Draw 3 entries from each row of pool, without replacement within the row
for i in range(pool.shape[0]):
    seln[i] = np.random.choice(pool[i], 3, replace=False)

print('pool = ', pool)
print('seln = ', seln)

>pool =  [[ 1 11 29  4 13]
 [29  1  2  3 24]
 [ 0 25 17  2 14]
 [20 22 18  9 29]]
seln =  [[ 8 12  0]
 [ 4 19 13]
 [ 8 15 24]
 [12 12 19]]

Here is a method that does not use a loop; however, it can select the same value multiple times in each row.

pool =  np.random.randint(0, 30, size=[4,5])
print(pool)
array([[ 4, 18,  0, 15,  9],
       [ 0,  9, 21, 26,  9],
       [16, 28, 11, 19, 24],
       [20,  6, 13,  2, 27]])

# New array shape
new_shape = (pool.shape[0],3)

# Indices where to randomly choose from
ix = np.random.choice(pool.shape[1], new_shape)
array([[0, 3, 3],
       [1, 1, 4],
       [2, 4, 4],
       [1, 2, 1]])

ixs = (ix.T + range(0,np.prod(pool.shape),pool.shape[1])).T
array([[ 0,  3,  3],
       [ 6,  6,  9],
       [12, 14, 14],
       [16, 17, 16]])

pool.flatten()[ixs].reshape(new_shape)
array([[ 4, 15, 15],
       [ 9,  9,  9],
       [11, 24, 24],
       [ 6, 13,  6]]) 

I am looking for a method that does not use a loop, and in which a value that has been selected from a row cannot be selected again.

Here is a way without explicit looping. However, it requires generating an array of random numbers the size of the original array. That said, the generation is done in compiled code, so it should be pretty fast. If two of the random numbers in a row happen to be identical, the tie is broken arbitrarily (argsort still returns a valid permutation), but the chance of that happening is essentially zero.

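import numpy as np

m, n = 4, 5
pool = np.random.randint(0, 30, size=[m, n])

new_width = 3
# True at exactly new_width random positions in every row
mask = np.argsort(np.random.rand(m, n)) < new_width

# Boolean indexing returns a flat array, so restore the row structure
pool[mask].reshape(m, new_width)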

How it works: we generate a random array of floats and argsort it. By default, argsort operates along the last axis (axis 1 for a 2D array), so each row is handled independently: the i,j entry of the argsorted array is the column index of the value that would land in position j if you sorted the i-th row.

We then find all the entries of this array whose values are less than new_width. Each row contains the numbers 0,...,n-1 in a random order, so exactly new_width of them will be less than new_width. This means each row of mask will have exactly new_width entries that are True, and the rest will be False (when you apply a comparison operator between an ndarray and a scalar, it is applied element-wise).
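The two claims above (each row of the argsorted array is a permutation, and the mask has exactly new_width True entries per row) can be checked directly. This is only a small sketch; the use of np.random.default_rng and the seed are assumptions made for repeatability, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, only so the sketch is repeatable
m, n, new_width = 4, 5, 3

# Each row of `order` should be a permutation of 0..n-1.
order = np.argsort(rng.random((m, n)), axis=1)
rows_are_permutations = bool((np.sort(order, axis=1) == np.arange(n)).all())

# The comparison is element-wise, so `mask` has the same shape as `order`.
mask = order < new_width
true_per_row = mask.sum(axis=1)

print(rows_are_permutations)  # True
print(true_per_row)           # [3 3 3 3]
```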

Finally, the boolean mask is applied to the original data to grab new_width entries from each row. Boolean indexing returns a flat array, which is why the result is reshaped afterwards.
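One detail of the mask approach is that boolean indexing returns the chosen entries in their original column order. If the order within each row should be random as well, a possible variation (not from the answer above) is to keep the first few columns of the argsorted array as column indices and gather them with np.take_along_axis. The helper name pick_without_replacement is hypothetical.

```python
import numpy as np

def pick_without_replacement(pool, k, rng):
    """Hypothetical helper: pick k distinct positions from every row of pool."""
    # Sorting random keys per row gives a random permutation of column indices;
    # its first k columns are k distinct positions, in random order.
    idx = np.argsort(rng.random(pool.shape), axis=1)[:, :k]
    return np.take_along_axis(pool, idx, axis=1), idx

rng = np.random.default_rng()
pool = np.arange(20).reshape(4, 5)  # distinct values, so any repeat would be visible
seln, idx = pick_without_replacement(pool, 3, rng)
print(seln)
```

Like np.random.choice with replace=False, this picks distinct positions, so a value that occurs twice in a row of pool can still appear twice in the selection.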

You could also use np.vectorize for your loop solution, although that is just shorthand for a loop.
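As a rough sketch of what that might look like, np.vectorize accepts a signature argument that makes it call the function once per row. The name pick_row and the use of a Generator from np.random.default_rng are assumptions for illustration; this is still a Python-level loop under the hood.

```python
import numpy as np

rng = np.random.default_rng()

# The signature tells np.vectorize to pass one whole row (core dimension n)
# per call and to expect a 1-D result (core dimension k) back.
pick_row = np.vectorize(
    lambda row: rng.choice(row, 3, replace=False),
    signature='(n)->(k)',
)

pool = np.arange(20).reshape(4, 5)
seln = pick_row(pool)
print(seln.shape)  # (4, 3)
```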
