
How to randomly assign values row-wise in a numpy array

My google-fu has failed me! I have a 10x10 numpy array initialized to 0 as follows:

arr2d = np.zeros((10,10))

For each row in arr2d, I want to set 3 random columns to 1. I am able to do it using a loop as follows:

for row in arr2d:
    rand_cols = np.random.randint(0,9,3)
    row[rand_cols] = 1

Output:

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  0.,  1.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  1.,  0.,  0.,  1.,  0.,  0.]])

Is there a way to exploit numpy or array indexing/slicing to achieve the same result in a more Pythonic/elegant way (preferably in 1 or 2 lines of code)?

Once you have arr2d initialized with arr2d = np.zeros((10,10)), you can use a vectorized approach with a two-liner like so:

# Generate 3 unique random column indices for each of the 10 rows
idx = np.random.rand(10,10).argsort(1)[:,:3]

# Assign them into initialized array
arr2d[np.arange(10)[:,None],idx] = 1

Or cram everything into a one-liner if you prefer it that way:

arr2d[np.arange(10)[:,None],np.random.rand(10,10).argsort(1)[:,:3]] = 1

Sample run:

In [11]: arr2d = np.zeros((10,10))  # Initialize array

In [12]: idx = np.random.rand(10,10).argsort(1)[:,:3]

In [13]: arr2d[np.arange(10)[:,None],idx] = 1

In [14]: arr2d # Verify by manual inspection
Out[14]: 
array([[ 0.,  1.,  0.,  1.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  1.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  1.,  0.,  1.],
       [ 0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  1.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  1.]])

In [15]: arr2d.sum(1) # Verify by counting ones in each row
Out[15]: array([ 3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.])
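For readers who want the two-liner for other shapes, here is a minimal sketch that wraps it in a function. The name random_ones_per_row and its parameters are hypothetical, introduced only for illustration. Sorting a row of independent uniform noise with argsort yields a random permutation of the column indices, so the first k entries are k distinct columns. The column vector np.arange(n_rows)[:, None] then broadcasts against the (n_rows, k) index array so that each row is paired with its own columns.

```python
import numpy as np

def random_ones_per_row(n_rows, n_cols, k):
    """Hypothetical helper: zeros array with exactly k ones in every row."""
    arr = np.zeros((n_rows, n_cols))
    # argsort of uniform noise gives a random column permutation per row;
    # keeping the first k entries selects k distinct columns.
    idx = np.random.rand(n_rows, n_cols).argsort(axis=1)[:, :k]
    # Row indices of shape (n_rows, 1) broadcast against idx of shape (n_rows, k).
    arr[np.arange(n_rows)[:, None], idx] = 1
    return arr

result = random_ones_per_row(10, 10, 3)
print(result.sum(axis=1))  # each entry should equal k
```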

Note: If you are looking for performance, I would suggest going with an np.argpartition-based approach, as listed in this other post.
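The linked post is not reproduced here, so the following is only a guess at what an np.argpartition-based variant might look like, not necessarily the code from that post. The idea is that argpartition only has to place the k smallest noise values of each row in the first k positions, which avoids a full sort of every row. The variable names n_rows, n_cols, k and noise are assumptions made for this sketch.

```python
import numpy as np

n_rows, n_cols, k = 10, 10, 3
arr2d = np.zeros((n_rows, n_cols))

# Independent uniform noise; the positions of the k smallest values in a row
# form a random set of k distinct columns.
noise = np.random.rand(n_rows, n_cols)

# After partitioning around position k, the first k entries of each row hold
# the indices of the k smallest values (in no particular order).
idx = np.argpartition(noise, k, axis=1)[:, :k]

arr2d[np.arange(n_rows)[:, None], idx] = 1
print(arr2d.sum(axis=1))  # each entry should equal k
```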

Use answers from this question to generate non-repeating random numbers. You can use random.sample from Python's random module, or np.random.choice with replace=False.

So, just a small modification to your code:

>>> import numpy as np
>>> for row in arr2d:
...     rand_cols = np.random.choice(range(10), 3, replace=False)
...     # Or the python standard lib alternative (use `import random`)
...     # rand_cols = random.sample(range(10), 3)
...     row[rand_cols] = 1
...
>>> arr2d
array([[ 0.,  0.,  0.,  0.,  0.,  1.,  1.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  1.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  1.,  1.],
       [ 0.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,  1.,  0.]])
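To illustrate why sampling without replacement matters here, the sketch below compares the two calls over many draws. np.random.randint(0, 9, 3) has an exclusive upper bound, so column 9 is never picked, and it draws independently, so a row can end up with fewer than three distinct columns. The seed and the number of trials are arbitrary choices made for this example.

```python
import numpy as np

rs = np.random.RandomState(0)  # arbitrary seed, only for repeatability

# randint(0, 9, 3): upper bound is exclusive and draws are independent.
draws = rs.randint(0, 9, size=(1000, 3))
rows_with_repeats = sum(len(set(d)) < 3 for d in draws)

# choice(10, 3, replace=False): three distinct columns from 0..9 every time.
picks = np.array([rs.choice(10, 3, replace=False) for _ in range(1000)])
all_distinct = all(len(set(p)) == 3 for p in picks)

print(draws.max())        # cannot exceed 8
print(rows_with_repeats)  # number of draws containing a repeated column
print(all_distinct)
```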

I don't think you can really leverage column slicing here to set values to 1, unless you're generating the randomized array from scratch. This is because your column indices are random for each row. You're better off leaving it in the form of a loop for readability.

I'm not sure how good this would be in terms of performance, but it's fairly concise.

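arr2d[:, :3] = 1
# map() is lazy in Python 3, so call shuffle in an explicit loop
for row in arr2d:
    np.random.shuffle(row)  # shuffles each row view in place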
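A vectorized variant of this fill-then-shuffle idea is sketched below. It assumes NumPy 1.20 or later, where Generator.permuted is available. Unlike shuffle, permuted shuffles each slice along the given axis independently, so every row gets its own random arrangement of the three ones.

```python
import numpy as np

rng = np.random.default_rng()  # new-style Generator, unseeded

arr2d = np.zeros((10, 10))
arr2d[:, :3] = 1  # three ones at the start of every row

# Shuffle each row independently; out=arr2d makes the operation in place.
rng.permuted(arr2d, axis=1, out=arr2d)

print(arr2d.sum(axis=1))  # each entry should still be 3
```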
