Accessing array elements by internal data index and order

This algorithmic problem is a bit too complex for me:

I have a NumPy array of data which is internally indexed in a peculiar way. The data is the output of two arrays spliced together (which I don't have), with distinct ordering. I set the parameter max to be a positive integer, and the data output has the index format

[ 00 10 20 ... max0 11 21 ... max1 22 32 ... max2 33 ... maxmax ]

The parameter max determines both the length of the output array and its ordering.

As an example, for max=2 the data is in the order

[00 10 20 11 21 22]

Setting max=3 gives

[00 10 20 30 11 21 31 22 32 33]

And max=4 gives

[00 10 20 30 40 11 21 31 41 22 32 42 33 43 44]

And so on.

I would like to write an algorithm to make a list/array of only the 3x values, i.e. the values with first index 3. That is, I would like to access only certain data values, selected by their first index.

However, the position of those values depends on the parameter max: as you can see, max determines where each datum sits in the array. My only idea is to build some sort of sorting tree, but I am not sure how to do that with this max parameter.

This list comprehension (or iteration) produces the indices that you show:

[[j*10+i for j in range(i,max+1)] for i in range(max+1)]

max=2: [[0, 10, 20], [11, 21], [22]]
max=3: [[0, 10, 20, 30], [11, 21, 31], [22, 32], [33]]
max=4: [[0, 10, 20, 30, 40], [11, 21, 31, 41], [22, 32, 42], [33, 43], [44]]

These lists of lists can be flattened. But this arrangement may make it easier to think about the problem.
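As a quick check that the nested lists really do match the order in the question, here is a minimal sketch that flattens them. The function name and the parameter name `max_index` are my own (chosen to avoid shadowing the built-in `max`), and the two-digit labels are only unambiguous for values up to 9.

```python
from itertools import chain

def index_labels(max_index):
    """Return the two-digit labels, column by column, as one flat list."""
    nested = [[j * 10 + i for j in range(i, max_index + 1)]
              for i in range(max_index + 1)]
    # chain.from_iterable concatenates the sublists in order
    return list(chain.from_iterable(nested))

print(index_labels(3))  # [0, 10, 20, 30, 11, 21, 31, 22, 32, 33]
```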

Or would it be more useful to generate tuples?

In [134]: [[(j,i) for j in range(i,max+1)] for i in range(max+1)]
Out[134]: 
[[(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)],
 [(1, 1), (2, 1), (3, 1), (4, 1)],
 [(2, 2), (3, 2), (4, 2)],
 [(3, 3), (4, 3)],
 [(4, 4)]]
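If the goal is just to pull out the entries with a given first index, the tuples can be turned into flat positions directly. The following is a sketch under the assumption that the data array follows exactly the ordering shown in the question; the function names are hypothetical, and the stand-in data is simply the labels themselves.

```python
import numpy as np

def positions_for_first_index(k, max_index):
    """Flat positions of the entries whose first index equals k."""
    pairs = [(j, i) for i in range(max_index + 1)
                    for j in range(i, max_index + 1)]
    return [pos for pos, (j, _) in enumerate(pairs) if j == k]

def positions_closed_form(k, max_index):
    """Same positions by arithmetic: the block for second index i
    starts at offset i*(max_index+1) - i*(i-1)//2."""
    return [i * (max_index + 1) - i * (i - 1) // 2 + (k - i)
            for i in range(k + 1)]

# Stand-in data: the labels in the question's order for max=3
data = np.array([0, 10, 20, 30, 11, 21, 31, 22, 32, 33])
idx = positions_for_first_index(3, 3)
print(idx)        # [3, 6, 8, 9]
print(data[idx])  # [30 31 32 33]
```

The closed-form version avoids building the whole list of pairs, which may matter if max is large.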

It isn't clear what you want to do with these numbers or indices, but here's an example of putting them in a 2d array:

In [150]: dlist=[[j*10+i for j in range(i,max+1)] for i in range(max+1)]
In [151]: ilist=[[(j,i) for j in range(i,max+1)] for i in range(max+1)]

In [152]: import itertools
In [153]: import numpy as np

In [155]: M=np.zeros((max+1,max+1),int)
In [157]: for (i,j),d in zip(itertools.chain(*ilist),itertools.chain(*dlist)):
    M[i,j]=d

In [158]: M
Out[158]: 
array([[ 0,  0,  0,  0,  0],
       [10, 11,  0,  0,  0],
       [20, 21, 22,  0,  0],
       [30, 31, 32, 33,  0],
       [40, 41, 42, 43, 44]])

The first 5 numbers go in the first column, the next 4 in the second, and so on.

itertools.chain is one way of flattening a list of lists.

The layout of M looks like a lower triangle. There's a NumPy function to generate those indices:

In [176]: np.tril_indices(5)
Out[176]: 
(array([0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4], dtype=int32),
 array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4], dtype=int32))

So I could have filled M with:

data =  np.dot([10,1],np.tril_indices(5))
M[np.tril_indices(5)] = data
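Note that tril_indices walks the triangle row by row, whereas the data in the question is ordered column by column. One way to handle real data in that order, sketched here with a hypothetical helper name, is to take np.triu_indices and swap the roles of the two index arrays. Once the data sits in the 2d array, the values with first index 3 are simply row 3.

```python
import numpy as np

def unpack_lower_triangle(data, max_index):
    """Place flat data (question's ordering) into a lower-triangular array
    so that M[first, second] holds the entry labelled 'first second'."""
    n = max_index + 1
    data = np.asarray(data)
    # triu_indices yields (row, col) with row <= col in row-major order,
    # which matches "second index outer, first index inner" in the question
    second, first = np.triu_indices(n)
    M = np.zeros((n, n), dtype=data.dtype)
    M[first, second] = data
    return M

# Stand-in data: the labels in the question's order for max=3
data = np.array([0, 10, 20, 30, 11, 21, 31, 22, 32, 33])
M = unpack_lower_triangle(data, 3)
print(M[3, :4])  # [30 31 32 33]
```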

Digging into the code of tril_indices, I find that the starting point is a mask of 1s generated by:

I=((np.arange(max)-np.arange(max)[:,None])<0).astype(int)
array([[0, 0, 0, 0],
       [1, 0, 0, 0],
       [1, 1, 0, 0],
       [1, 1, 1, 0]])
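The mask as written uses a strict comparison, so it leaves out the diagonal. The sketch below is my own comparison, not a trace of NumPy's source: it checks that the strict mask corresponds to tril_indices with k=-1, while switching to <= reproduces the default tril_indices.

```python
import numpy as np

n = 4
col = np.arange(n)
row = np.arange(n)[:, None]

strict_mask = (col - row) < 0      # entries below the diagonal only
inclusive_mask = (col - row) <= 0  # diagonal included

# np.nonzero returns the (row, col) positions of the True entries
strict_idx = np.nonzero(strict_mask)
inclusive_idx = np.nonzero(inclusive_mask)

print(np.array_equal(strict_idx, np.tril_indices(n, k=-1)))  # True
print(np.array_equal(inclusive_idx, np.tril_indices(n)))     # True
```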
