Assigning values to a numpy array using row indices

Question

Suppose I have two arrays, a=np.array([0,0,1,1,1,2]) and b=np.array([1,2,4,2,6,5]). The elements of a are the row indices where the values of b should be assigned. If there are multiple elements in the same row, the values should be assigned in order. So the result is a 2D array c:

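import numpy as np

a = np.array([0, 0, 1, 1, 1, 2])
b = np.array([1, 2, 4, 2, 6, 5])

c = np.zeros((3, 4))
counts = {k: 0 for k in range(3)}   # next free column in each row
for i in range(a.shape[0]):
    c[a[i], counts[a[i]]] = b[i]
    counts[a[i]] += 1
print(c)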

Is there a way to use some fancy indexing in numpy to get this result faster (without a for loop) when the arrays are big?

Answer 1

I had to run your code to actually see what it produced. There are limits to what I can 'run' in my head.

In [230]: c                                                                                            
Out[230]: 
array([[1., 2., 0., 0.],
       [4., 2., 6., 0.],
       [5., 0., 0., 0.]])
In [231]: counts                                                                                       
Out[231]: {0: 2, 1: 3, 2: 1}

Omitting this information may be delaying possible answers. 'Vectorization' requires thinking in whole-array terms, which is easiest if I can visualize the result and look for a pattern.

This looks like a `padding` problem.

In [260]: u, c = np.unique(a, return_counts=True)                                                      
In [261]: u                                                                                            
Out[261]: array([0, 1, 2])
In [262]: c                                                                                            
Out[262]: array([2, 3, 1])      # cf with counts

Related question: Load data with rows of different sizes into Numpy array

Working from previous padding questions, I can construct a mask:

In [263]: mask = np.arange(4)<c[:,None]                                                                
In [264]: mask                                                                                         
Out[264]: 
array([[ True,  True, False, False],
       [ True,  True,  True, False],
       [ True, False, False, False]])
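
The mask above comes from broadcasting a row of column positions against a column of per-row counts. The sketch below spells out the shapes involved; the names `counts`, `n_cols`, `col_idx` and `per_row` are chosen here for illustration and do not appear in the answer.

```python
import numpy as np

counts = np.array([2, 3, 1])      # per-row counts, as returned by np.unique above
n_cols = 4                        # assumed width of the target array

col_idx = np.arange(n_cols)       # shape (4,): candidate column positions
per_row = counts[:, None]         # shape (3, 1): one count per row

# Broadcasting compares every column position against every row's count,
# so row i is True in its first counts[i] columns.
mask = col_idx < per_row          # shape (3, 4)

print(mask.shape)
print(mask.sum(axis=1))           # number of True values per row
```

Each row of the mask therefore has exactly as many True entries as that row has values to receive.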

and use that to assign the b values to c:

In [265]: c = np.zeros((3,4),int)                                                                      
In [266]: c[mask] = b                                                                                  
In [267]: c                                                                                            
Out[267]: 
array([[1, 2, 0, 0],
       [4, 2, 6, 0],
       [5, 0, 0, 0]])
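
Boolean-mask assignment fills the True positions in row-major order (row 0 left to right, then row 1, and so on), so it lines up with b here only because a is sorted. If a were not sorted, one possible workaround is a stable argsort that groups b by row first. This is a sketch on made-up data (`a_unsorted` and `b_unsorted` are not from the question), not part of the original answer.

```python
import numpy as np

# Made-up, unsorted variant of the question's data (same values per row).
a_unsorted = np.array([1, 0, 1, 2, 0, 1])
b_unsorted = np.array([4, 1, 2, 5, 2, 6])

# A stable sort groups equal row indices together without reordering them,
# so values within each row keep their original order.
order = np.argsort(a_unsorted, kind="stable")
b_grouped = b_unsorted[order]

_, counts = np.unique(a_unsorted, return_counts=True)
mask = np.arange(4) < counts[:, None]

c = np.zeros((3, 4), dtype=int)
c[mask] = b_grouped               # filled row by row, left to right
print(c)
```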

Since a is already sorted, we might get the counts faster than with unique. This approach will also have problems if a doesn't have any values for some row(s): unique skips the missing row, so the mask ends up with fewer rows than c.
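
One way to handle empty rows, offered as a sketch rather than as the answer's own method, is np.bincount with minlength. It counts occurrences of each non-negative integer and reports 0 for rows that never appear in a, so the mask keeps one row per output row. Whether it is faster than unique on a given input is not measured here. The helper name `fill_rows` and its parameters are hypothetical.

```python
import numpy as np

def fill_rows(a, b, n_rows, n_cols):
    """Hypothetical helper: place b into an (n_rows, n_cols) array by row index.

    Assumes a is sorted, holds non-negative integers below n_rows, and no row
    receives more than n_cols values.
    """
    # minlength keeps a zero count for rows that never appear in a.
    counts = np.bincount(a, minlength=n_rows)
    mask = np.arange(n_cols) < counts[:, None]
    c = np.zeros((n_rows, n_cols), dtype=b.dtype)
    c[mask] = b
    return c

# Row 1 is empty here; np.unique would return counts for only two rows.
a = np.array([0, 0, 2])
b = np.array([1, 2, 5])
print(fill_rows(a, b, n_rows=3, n_cols=4))
```

If some row holds more than n_cols values, the mask has fewer True entries than b has elements and the assignment raises an error, so n_cols should be at least the largest count.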

Answer 1 details: 2 votes, accepted, 2020-04-10 06:12:46
